lgmgeo / AnnotSV

Annotation and Ranking of Structural Variation

AnnotSV running endlessly #171

Closed: ksarathbabu closed this issue 1 year ago

ksarathbabu commented 1 year ago

Hi Veronique,

Thank you for keeping AnnotSV updated and making it the go-to tool for SV annotation. I used v3.1 on my dataset and it worked fine. I saw that new versions had been released, so I tried 3.3.2, but it runs endlessly. My SV list is output from XHMM, in VCF format.

Below is the log.

AnnotSV 3.3.2

Copyright (C) 2017-2023 GEOFFROY Veronique

Please feel free to contact me for any suggestions or bug reports
email: veronique.geoffroy@inserm.fr

Tcl/Tk version: 8.6

Application name used (defined with the "ANNOTSV" environment variable):
/Users/ksarathbabu/Desktop/Tools/AnnotSV

...downloading the configuration data (April 08 2023 - 23:10)
    ...configuration data by default
    ...configuration data from /Users/xyz/Desktop/Tools/AnnotSV/etc/AnnotSV/configfile
    ...configuration data given in arguments
    ...checking all these configuration data

...checking the annotation data sources

WARNING: No GeneHancer annotations available.
(Please, see in the README file how to add these annotations. Users need to contact the GeneCards team.)

...listing arguments
    ******************************************
    AnnotSV has been run with these arguments:
    ******************************************
    -REreport 0
    -REselect1 1
    -REselect2 1
    -SVinputFile Trios_IDTv2.vcf
    -SVinputInfo 1
    -SVminSize 50
    -annotationMode full
    -annotationsDir /User/xyz/Desktop/Tools/AnnotSV/share/AnnotSV
    -bcftools bcftools
    -bedtools bedtools
    -benignAF 0.01
    -candidateGenesFiltering 0
    -cytoband 1
    -genomeBuild GRCh37
    -includeCI 1
    -metrics us
    -miRNAann 1
    -minTotalNumber 500
    -organism Human
    -outputDir .
    -outputFile Trios_IDTv2.annotSV_V3.3.2.tsv
    -overlap 100
    -overwrite 1
    -promoterSize 500
    -rankFiltering 1 2 3 4 5 NA
    -reciprocal 0
    -samplesidBEDcol -1
    -snvIndelPASS 0
    -svtBEDcol -1
    -tx RefSeq
    -vcf 0
    ******************************************

...searching for SV overlaps with a gene or a regulatory elements
    ...4828 genes overlapped with an SV
    ...16313 genes regulated by a regulatory element which is overlapped with an SV

...listing of the annotations to be realized (April 08 2023 - 23:13)
    ...CytoBand annotation
    ...Genes annotation
        ...RefSeq annotation
    ...Regulatory elements annotations
        ...Promoter annotations
        ...EnhancerAtlas annotations
    ...Annotations with pathogenic genes or genomic regions
        ...dbVar annotation
        ...ClinVar annotation
        ...ClinGen annotation
    ...Annotations with pathogenic snv/indel
    ...Annotations with benign genes or genomic regions
        ...gnomAD annotation
        ...ClinVar annotation
        ...ClinGen annotation
        ...DGV annotation
        ...DDD annotation
        ...1000g annotation
        ...Ira M. Hall's lab annotation
        ...Children’s Mercy Research Institute
    ...Annotations with features overlapped with the SV (100 %)
        ...TAD annotation
    ...Annotations with features sharing any overlap with the SV
    ...Breakpoints annotations
        ...GC content annotation
        ...Repeat annotation
        ...Gap annotation
        ...Segmental duplication annotation
        ...ENCODE blacklist annotation
    ...Gene-based annotations
        ...20220617_ACMG.tsv
        (78 gene identifiers and 1 annotations columns: ACMG)
        ...20220906_ClinGenAnnotations.tsv
        (1480 gene identifiers and 2 annotations columns: HI, TS)
        ...20200713_HI.tsv.gz
        (19124 gene identifiers and 1 annotations columns: DDD_HI_percent)
        ...20191219_ExAC.CNV-Zscore.annotations.tsv.gz
        (15673 gene identifiers and 3 annotations columns: ExAC_delZ, ExAC_dupZ, ExAC_cnvZ)
        ...20201023_GeneIntolerance-Zscore.annotations.tsv.gz
        (18241 gene identifiers and 2 annotations columns: ExAC_synZ, ExAC_misZ)
        ...20220902_GenCC.tsv
        (4615 gene identifiers and 4 annotations columns: GenCC_disease, GenCC_moi, GenCC_classification, GenCC_pmid)
        ...20220905_OMIM-1-annotations.tsv.gz
        (16250 gene identifiers and 1 annotations columns: OMIM_ID)
        ...20220905_OMIM-2-annotations.tsv.gz
        (16250 gene identifiers and 2 annotations columns: OMIM_phenotype, OMIM_inheritance)
        ...20220905_morbid.tsv.gz
        (12998 gene identifiers and 1 annotations columns: OMIM_morbid)
        ...20220905_morbidCandidate.tsv.gz
        (3467 gene identifiers and 1 annotations columns: OMIM_morbid_candidate)
        ...20201106_gnomAD.LOEUF.pLI.annotations.tsv.gz
        (19451 gene identifiers and 3 annotations columns: LOEUF_bin, GnomAD_pLI, ExAC_pLI)

...annotation in progress (April 08 2023 - 23:13)
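
The 3.3.2 log above stops at the "annotation in progress" step and never reports writing the output file. As a minimal sketch (not part of AnnotSV), a run like this could be wrapped with a wall-clock limit so that a hang is reported instead of blocking indefinitely. The helper name `run_with_time_limit` and the limit value are assumptions chosen for illustration.

```python
import subprocess


def run_with_time_limit(cmd, limit_seconds):
    """Run cmd and return its exit code, or None if it exceeded limit_seconds."""
    try:
        completed = subprocess.run(cmd, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising on timeout.
        return None
    return completed.returncode


# Hypothetical usage: the flags are taken from the argument listing in the log;
# an "AnnotSV" executable on PATH and the 600 s limit are assumptions.
# status = run_with_time_limit(
#     ["AnnotSV", "-SVinputFile", "Trios_IDTv2.vcf", "-genomeBuild", "GRCh37"],
#     limit_seconds=600,
# )
# if status is None:
#     print("AnnotSV did not finish within the time limit")
```

A return value of `None` only says the limit was exceeded; it does not tell a genuine hang apart from a legitimately long run, so the limit should be chosen from a known-good run time.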

ksarathbabu commented 1 year ago

Interestingly, the same input worked fine with version 3.3, finishing in about 2 minutes.

`AnnotSV 3.3

Copyright (C) 2017-2023 GEOFFROY Veronique

Please feel free to contact me for any suggestions or bug reports
email: veronique.geoffroy@inserm.fr

Tcl/Tk version: 8.6

Application name used (defined with the "ANNOTSV" environment variable):
/Users/xyz/Desktop/Tools/AnnotSV-3.3

...downloading the configuration data (April 09 2023 - 09:05)
    ...configuration data by default
    ...configuration data from /Users/xyz/Desktop/Tools/AnnotSV-3.3/etc/AnnotSV/configfile
    ...configuration data given in arguments
    ...checking all these configuration data

...checking the annotation data sources

WARNING: No GeneHancer annotations available.
(Please, see in the README file how to add these annotations. Users need to contact the GeneCards team.)

...listing arguments
******************************************
AnnotSV has been run with these arguments:
******************************************
-REreport 0
-REselect1 1
-REselect2 1
-SVinputFile Trios_IDTv2.vcf
-SVinputInfo 1
-SVminSize 50
-annotationMode full
-annotationsDir /Users/xyz/Desktop/Tools/AnnotSV-3.3/share/AnnotSV
-bcftools bcftools
-bedtools bedtools
-benignAF 0.01
-candidateGenesFiltering 0
-cytoband 1
-genomeBuild GRCh37
-includeCI 1
-metrics us
-miRNAann 1
-minTotalNumber 500
-organism Human
-outputDir .
-outputFile Trios_IDTv2.annotSV_V3.3.tsv
-overlap 100
-overwrite 1
-promoterSize 500
-rankFiltering 1 2 3 4 5 NA
-reciprocal 0
-samplesidBEDcol -1
-snvIndelPASS 0
-svtBEDcol -1
-tx RefSeq
-vcf 0
******************************************

...searching for SV overlaps with a gene or a regulatory elements
    ...4828 genes overlapped with an SV
    ...16313 genes regulated by a regulatory element which is overlapped with an SV

...listing of the annotations to be realized (April 09 2023 - 09:09)
    ...CytoBand annotation
    ...Genes annotation
        ...RefSeq annotation
    ...Regulatory elements annotations
        ...Promoter annotations
        ...EnhancerAtlas annotations
    ...Annotations with pathogenic genes or genomic regions
        ...dbVar annotation
        ...ClinVar annotation
        ...ClinGen annotation
    ...Annotations with pathogenic snv/indel
    ...Annotations with benign genes or genomic regions
        ...gnomAD annotation
        ...ClinVar annotation
        ...ClinGen annotation
        ...DGV annotation
        ...DDD annotation
        ...1000g annotation
        ...Ira M. Hall's lab annotation
        ...Children’s Mercy Research Institute
    ...Annotations with features overlapped with the SV (100 %)
        ...TAD annotation
    ...Annotations with features sharing any overlap with the SV
    ...Breakpoints annotations
        ...GC content annotation
        ...Repeat annotation
        ...Gap annotation
        ...Segmental duplication annotation
        ...ENCODE blacklist annotation
    ...Gene-based annotations
        ...20220617_ACMG.tsv
        (78 gene identifiers and 1 annotations columns: ACMG)
        ...20220906_ClinGenAnnotations.tsv
        (1480 gene identifiers and 2 annotations columns: HI, TS)
        ...20200713_HI.tsv.gz
        (19124 gene identifiers and 1 annotations columns: DDD_HI_percent)
        ...20191219_ExAC.CNV-Zscore.annotations.tsv.gz
        (15673 gene identifiers and 3 annotations columns: ExAC_delZ, ExAC_dupZ, ExAC_cnvZ)
        ...20201023_GeneIntolerance-Zscore.annotations.tsv.gz
        (18241 gene identifiers and 2 annotations columns: ExAC_synZ, ExAC_misZ)
        ...20220902_GenCC.tsv
        (4615 gene identifiers and 4 annotations columns: GenCC_disease, GenCC_moi, GenCC_classification, GenCC_pmid)
        ...20220905_OMIM-1-annotations.tsv.gz
        (16250 gene identifiers and 1 annotations columns: OMIM_ID)
        ...20220905_OMIM-2-annotations.tsv.gz
        (16250 gene identifiers and 2 annotations columns: OMIM_phenotype, OMIM_inheritance)
        ...20220905_morbid.tsv.gz
        (12998 gene identifiers and 1 annotations columns: OMIM_morbid)
        ...20220905_morbidCandidate.tsv.gz
        (3467 gene identifiers and 1 annotations columns: OMIM_morbid_candidate)
        ...20201106_gnomAD.LOEUF.pLI.annotations.tsv.gz
        (19451 gene identifiers and 3 annotations columns: LOEUF_bin, GnomAD_pLI, ExAC_pLI)

...annotation in progress (April 09 2023 - 09:09)

...writing of ./Trios_IDTv2.annotSV_V3.3.tsv (April 09 2023 - 09:11)

...output columns annotation (April 09 2023 - 09:11): AnnotSV_ID;SV_chrom;SV_start;SV_end;SV_length;SV_type;Samples_ID;ID;REF;ALT;QUAL;FILTER;INFO;FORMAT;samplesxyz;Annotation_mode;CytoBand;Gene_name;Gene_count;Tx;Tx_start;Tx_end;Overlapped_tx_length;Overlapped_CDS_length;Overlapped_CDS_percent;Frameshift;Exon_count;Location;Location2;Dist_nearest_SS;Nearest_SS_type;Intersect_start;Intersect_end;RE_gene;P_gain_phen;P_gain_hpo;P_gain_source;P_gain_coord;P_loss_phen;P_loss_hpo;P_loss_source;P_loss_coord;P_ins_phen;P_ins_hpo;P_ins_source;P_ins_coord;po_P_gain_phen;po_P_gain_hpo;po_P_gain_source;po_P_gain_coord;po_P_gain_percent;po_P_loss_phen;po_P_loss_hpo;po_P_loss_source;po_P_loss_coord;po_P_loss_percent;P_snvindel_nb;P_snvindel_phen;B_gain_source;B_gain_coord;B_gain_AFmax;B_loss_source;B_loss_coord;B_loss_AFmax;B_ins_source;B_ins_coord;B_ins_AFmax;B_inv_source;B_inv_coord;B_inv_AFmax;po_B_gain_allG_source;po_B_gain_allG_coord;po_B_gain_someG_source;po_B_gain_someG_coord;po_B_loss_allG_source;po_B_loss_allG_coord;po_B_loss_someG_source;po_B_loss_someG_coord;TAD_coordinate;ENCODE_experiment;GC_content_left;GC_content_right;Repeat_coord_left;Repeat_type_left;Repeat_coord_right;Repeat_type_right;Gap_left;Gap_right;SegDup_left;SegDup_right;ENCODE_blacklist_left;ENCODE_blacklist_characteristics_left;ENCODE_blacklist_right;ENCODE_blacklist_characteristics_right;ACMG;HI;TS;DDD_HI_percent;ExAC_delZ;ExAC_dupZ;ExAC_cnvZ;ExAC_synZ;ExAC_misZ;GenCC_disease;GenCC_moi;GenCC_classification;GenCC_pmid;OMIM_ID;OMIM_phenotype;OMIM_inheritance;OMIM_morbid;OMIM_morbid_candidate;LOEUF_bin;GnomAD_pLI;ExAC_pLI;AnnotSV_ranking_score;AnnotSV_ranking_criteria;ACMG_class

...AnnotSV is done with the analysis (April 09 2023 - 09:11)`
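
The "2 minutes" figure can be read directly from the timestamps in the 3.3 log: annotation starts at 09:09 and the output is written at 09:11. A minimal sketch of extracting that from a log, assuming the `(Month DD YYYY - HH:MM)` stamp format seen in this thread and an English locale for month names; the helper name `step_times` is hypothetical.

```python
import re
from datetime import datetime

# Matches stamps such as "(April 09 2023 - 09:09)".
STAMP = re.compile(r"\(([A-Z][a-z]+ \d{2} \d{4} - \d{2}:\d{2})\)")


def step_times(log_text):
    """Return (step text, datetime) for every log line that carries a timestamp."""
    steps = []
    for line in log_text.splitlines():
        match = STAMP.search(line)
        if match:
            when = datetime.strptime(match.group(1), "%B %d %Y - %H:%M")
            steps.append((line[:match.start()].strip(), when))
    return steps


# Two lines copied from the 3.3 log in this thread.
log_33 = """\
...annotation in progress (April 09 2023 - 09:09)
...writing of ./Trios_IDTv2.annotSV_V3.3.tsv (April 09 2023 - 09:11)
"""

steps = step_times(log_33)
elapsed = steps[-1][1] - steps[0][1]
print(elapsed)  # 0:02:00
```

Applied to the 3.3.2 log, the same function would return "annotation in progress" as the last stamped step, which is where that run stalled. The stamps have minute resolution, so short steps show as zero elapsed time.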

lgmgeo commented 1 year ago

Hi @ksarathbabu,

A bug fix is under development on the patch_AnnotSV branch. I should push this to the master branch very soon (today?).

Really sorry for the inconvenience,

Best, Véronique

lgmgeo commented 1 year ago

Fix added in v3.3.4. Sorry for any inconvenience this may have caused.
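
For anyone checking whether an installation already includes this fix, the version banner at the top of the log can be compared against 3.3.4. This is a minimal sketch, not an AnnotSV feature: the function names are hypothetical, and it assumes the banner has the form `AnnotSV X.Y[.Z]` as in the logs above.

```python
import re

FIXED_IN = (3, 3, 4)  # release named in the maintainer's comment


def banner_version(log_text):
    """Extract the version tuple from the 'AnnotSV X.Y[.Z]' banner line, or None."""
    match = re.search(r"^AnnotSV (\d+(?:\.\d+)*)\s*$", log_text, flags=re.MULTILINE)
    if match is None:
        return None
    parts = [int(p) for p in match.group(1).split(".")]
    parts += [0] * (3 - len(parts))  # pad "3.3" to (3, 3, 0)
    return tuple(parts)


def includes_fix(version):
    """True if the version is at or after the release that carries the fix."""
    return version >= FIXED_IN


print(banner_version("AnnotSV 3.3.2\n\nCopyright (C) 2017-2023 GEOFFROY Veronique"))
print(includes_fix((3, 3, 2)), includes_fix((3, 3, 4)))
```

Note that `includes_fix` returning `False` does not mean a version is affected: in this thread 3.3 ran normally, and only 3.3.2 was reported to hang.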