Closed Stikus closed 4 years ago
Hi,
Thank you for your interest in using AnnotSV.
Annotations are updated and formatted specifically for each significant release of AnnotSV (once or twice a year). The next annotation update will be available for v2.4 (not for v2.3.3; redoing the update every time a source annotation changes would be too time-consuming for me). In other words, different releases such as v2.3.2 and v2.3.3 share the same annotation data.
Regarding your request, I'm not sure I understand it well. Could you explain in more detail why the use of:
-annotationsDir AnnotSV/2.3.2/
or -annotationsDir AnnotSV/2.3.3/
will not work with the use of Docker?
Best regards, Véronique
Thank you for your answer.
Maybe I misunderstood your use of annotation sources. Can we manually update Exomiser from 1902 in the Makefile to the latest 2003 simply by using new data from here?
For now annotations are stored in a fixed structure and I assume that we can update them independently - am I wrong?
Or should we strictly use the 1902 build of the Exomiser data, as stated in the Makefile?
For now, I'm planning to manually download the full 20 GB of Exomiser 2003 data and use it. Only after implementing this did I realise that you download only the 2 GB 1902_phenotype.zip and not the full 20 GB 1909_hg19.zip. Moreover, according to this issue I don't need to download both hg19 and hg38 data - is that correct?
Sorry for the confusion, let's make a list of questions:
Can we use the full 20 GB https://data.monarchinitiative.org/exomiser/data/1902_hg19.zip instead of your https://www.lbgi.fr/~geoffroy/Annotations/1902_hg19.tar.gz ? Should we (I assume 'no' here)?
If we want to update from 1902 to 2003 - can we do it ourselves, by using new files from https://data.monarchinitiative.org/exomiser/data ?
When you release AnnotSV 2.4, will old annotations (like the Exomiser 1902 data) still work?
And finally - how do we make the Exomiser step work? Before I had its data I got WARNING: No Exomiser annotations available in /ref/AnnotSV/2.3.2/Annotations_Exomiser/. Now nothing happens, but the output files are the same and I don't see any ..running Exomiser messages in the log like here.
Can we manually update Exomiser from 1902 in Makefile to latest 2003 simply by using new data from here?
Yes you can. You just need to keep the same Exomiser files and hierarchy as in AnnotSV.
For now annotations are stored in fixed structure and I assume that we can update them independently - am I wrong?
If we want to update from 1902 to 2003 - can we do it ourselves? By using new files from https://data.monarchinitiative.org/exomiser/data ?
Yes you can, but I haven't checked the latest 2003 data from Exomiser yet. It should work if the format is unchanged. Otherwise, please contact me by email (veronique.geoffroy@inserm.fr) for debugging.
For now, I'm planning to manually download full 20 GB of Exomiser 2003 and use them. Only after implementation of this feature I realised that you're downloading only 2 GB 1902_phenotype.zip and not full 20 GB 1909_hg19.zip. Moreover - according to this issue I don't need to download both hg19 and hg38 data - is it correct?
Absolutely correct. This module makes use of Exomiser (Smedley et al., 2015) and HPO (Köhler et al., 2019) to score genes overlapped with an SV on biological relevance to the individual phenotype. It has no link with the genome build version.
Can we use full 20 GB https://data.monarchinitiative.org/exomiser/data/1902_hg19.zip instead of your https://www.lbgi.fr/~geoffroy/Annotations/1902_hg19.tar.gz ? Should we (I assume 'no' here)?
No, the full 20 GB can't be used. Only some of these zipped files (9 KB) are needed.
When you release AnnotSV 2.4 old annotations will work or not (like Exomiser 1902 data)?
Theoretically, it should work, unless the Exomiser format has changed, and I don't think it has.
And finally - how to make Exomiser step work? Before I have its data I got WARNING: No Exomiser annotations available in /ref/AnnotSV/2.3.2/Annotations_Exomiser/. Now nothing happened but output files are same and I don't see any ..running Exomiser messages in log like here.
Do you run the latest version of AnnotSV (2.3.2)? If yes, can you please send me by email the result of the following command lines:
echo $ANNOTSV
ls $ANNOTSV/share/AnnotSV/Annotations_Exomiser/
ls $ANNOTSV/share/AnnotSV/Annotations_Exomiser/1902/*
Yes, I've found that this block https://github.com/lgmgeo/AnnotSV/blob/master/Makefile#L154 is missing in my installation, thx for pointing out.
I'll report my results tomorrow.
Can you tell me whether the share/AnnotSV/jar/ content should be next to Annotations_Exomiser or in $ANNOTSV/share/ ?
For now, I have:
root@970129676c3b:/outputs# echo $ANNOTSV
/soft/AnnotSV-2.3.2
root@970129676c3b:/outputs# ls -la $ANNOTSV
total 12
drwxr-xr-x. 5 root root 57 Jun 11 16:14 .
drwxr-xr-x. 1 root root 27 Jun 11 16:14 ..
-rwxr-xr-x. 1 root root 8843 Jun 11 15:32 Makefile
drwxr-xr-x. 2 root root 21 Jun 11 16:14 bin
drwxr-xr-x. 3 root root 21 Jun 11 16:14 etc
drwxr-xr-x. 5 root root 43 Jun 11 16:14 share
root@970129676c3b:/outputs# ls -la $ANNOTSV/share/
total 0
drwxr-xr-x. 5 root root 43 Jun 11 16:14 .
drwxr-xr-x. 5 root root 57 Jun 11 16:14 ..
drwxr-xr-x. 3 root root 21 Jun 11 16:14 bash
drwxr-xr-x. 3 root root 21 Jun 11 16:14 doc
drwxr-xr-x. 3 root root 21 Jun 11 16:14 tcl8.6
root@970129676c3b:/outputs# ls -la /ref/AnnotSV/2.3.2/
total 0
drwxr-xr-x. 4 997 root 59 Jun 15 14:56 .
drwxr-xr-x. 3 997 root 19 Jun 11 16:14 ..
drwxr-xr-x. 4 997 root 30 Jun 15 19:13 Annotations_Exomiser
drwxr-xr-x. 8 997 root 127 Dec 20 14:52 Annotations_Human
Ok, so you use -annotationsDir /ref/AnnotSV/2.3.2/, right?
So the share/AnnotSV/jar/ content should be in /ref/AnnotSV/2.3.2/
On my installation, without using the -annotationsDir option, I have:
ls -la $ANNOTSV
total 32
drwxr-xr-x 5 geoffroy lgm 4096 Jun 15 20:14 .
drwxr-xr-x 3 geoffroy lgm 4096 Jun 15 20:13 ..
drwxr-xr-x 2 geoffroy lgm 4096 Jun 15 20:14 bin
drwxr-xr-x 3 geoffroy lgm 4096 Jun 15 20:14 etc
-rwxr-xr-x 1 geoffroy lgm 8843 Jun 15 20:13 Makefile
drwxr-xr-x 6 geoffroy lgm 4096 Jun 15 20:17 share
ls -la $ANNOTSV/share/
total 24
drwxr-xr-x 6 geoffroy lgm 4096 Jun 15 20:17 .
drwxr-xr-x 5 geoffroy lgm 4096 Jun 15 20:14 ..
drwxr-xr-x 5 geoffroy lgm 4096 Jun 15 20:18 AnnotSV
drwxr-xr-x 3 geoffroy lgm 4096 Jun 15 20:14 bash
drwxr-xr-x 3 geoffroy lgm 4096 Jun 15 20:14 doc
drwxr-xr-x 3 geoffroy lgm 4096 Jun 15 20:14 tcl8.6
ls -la $ANNOTSV/share/AnnotSV/
total 20
drwxr-xr-x 5 geoffroy lgm 4096 Jun 15 20:18 .
drwxr-xr-x 6 geoffroy lgm 4096 Jun 15 20:17 ..
drwxr-xr-x 3 geoffroy lgm 4096 Jun 15 20:17 Annotations_Exomiser
drwxr-xr-x 8 geoffroy lgm 4096 Dec 20 12:52 Annotations_Human
drwxr-xr-x 2 geoffroy lgm 4096 Jun 15 20:18 jar
Does it help you to solve the bug?
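A quick way to compare the two installations is to check the directory passed to -annotationsDir for the three sub-directories visible in the listing above. This is only a sketch based on that listing (Annotations_Human, Annotations_Exomiser, jar); the function name check_annotsv_layout is hypothetical and nothing here comes from the AnnotSV documentation.

```shell
# Sketch: report which sub-directories from the listing above are missing
# in an annotations directory. The list of names is an assumption taken
# from this thread, not from AnnotSV itself.
check_annotsv_layout() {
    layout_dir="$1"
    layout_missing=0
    for layout_sub in Annotations_Human Annotations_Exomiser jar; do
        if [ ! -d "$layout_dir/$layout_sub" ]; then
            echo "missing: $layout_dir/$layout_sub"
            layout_missing=1
        fi
    done
    # Non-zero exit status when at least one sub-directory is absent
    return "$layout_missing"
}
```

For example, check_annotsv_layout /ref/AnnotSV/2.3.2 would print one "missing:" line per absent sub-directory.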
Ok, so you use -annotationsDir /ref/AnnotSV/2.3.2/, right?
Right. Thanks for the answer, I'll check. We have had serious ISP issues for the last week, so even downloading the full zip of AnnotSV is difficult for me now - 90% of the time I get:
Archive: /soft/AnnotSV-4fea16c6f0dcbaedd19ced58c34d22becbcf2b6c.zip
 End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of /soft/AnnotSV-4fea16c6f0dcbaedd19ced58c34d22becbcf2b6c.zip or
/soft/AnnotSV-4fea16c6f0dcbaedd19ced58c34d22becbcf2b6c.zip.zip, and cannot find /soft/AnnotSV-4fea16c6f0dcbaedd19ced58c34d22becbcf2b6c.zip.ZIP, period.
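The unzip error above is what a truncated download usually looks like. One way to catch it before unpacking is to test the archive first; the sketch below assumes the Info-ZIP unzip tool is installed and uses its -t (test) and -q (quiet) options. The function name is_valid_zip is made up for illustration.

```shell
# Sketch: succeed only if the file is a complete, readable zip archive.
# Assumes Info-ZIP's unzip is on PATH; if it is not, the check fails too.
is_valid_zip() {
    # -t tests the archive without extracting, -q keeps it quiet
    unzip -tq "$1" >/dev/null 2>&1
}
```

In a Docker build this could guard the extraction step, for instance by retrying the download while is_valid_zip fails.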
But it looks like your solution is correct.
And about updating Exomiser - there are 2 different problems here:
We can use 2003_phenotype.zip instead of 1902_phenotype.zip - this should be trivial (if 1902 is not hardcoded somewhere)
We need an updated version of your 1902_hg19.tar.gz for 2003 - correct? Or don't we need to update it? If we do - can we do it ourselves, or can only you do it with the next major release?
We can use 2003_phenotype.zip instead of 1902_phenotype.zip - this should be trivial (if 1902 is not hardcoded somewhere)
Absolutely. It also requires an update of the "etc/AnnotSV/application.properties" file.
We need updated version of your 1902_hg19.tar.gz for 2003 - correct? Or we don't need to update? If we do - can we do it ourselves or only you can do it with next major release?
The Exomiser jar file has a dependency on the genome data. With the help of Jules JACOBSEN (Exomiser developer), AnnotSV hacks it by simply including the 1902_hg19.tar.gz. So, in theory, we only need to change the name of these files.
Just to let you know, an update is planned for this summer. If you can wait for that, it will be easier...
We absolutely can wait, it is just curiosity.
Absolutely. It also requires an update of the "etc/AnnotSV/application.properties" file.
https://github.com/lgmgeo/AnnotSV/blob/master/etc/AnnotSV/application.properties#L29 - this line I assume?
Can we use the 2003 phenotype data with the 1902 hg19 data? Would we get any benefit? Or should we just use 1902 until you update the tool - what do you think?
Thanks for fast answers and help :)
Looks like something is still not working:
Command: '/soft/AnnotSV-2.3.2/bin/AnnotSV -annotationsDir /ref/AnnotSV/2.3.2 -genomeBuild GRCh37 -SVinputFile /inputs/som_candidateSV.vcf -outputFile /outputs/test_som.annotsv.tsv'.
PID=156 (last job)
AnnotSV 2.3.2
Copyright (C) 2017-2019 GEOFFROY Veronique
Please feel free to contact me for any suggestions or bug reports
email: veronique.geoffroy@inserm.fr
Tcl/Tk version: 8.6
Application name used (defined with the "ANNOTSV" environment variable):
/soft/AnnotSV-2.3.2
...downloading the configuration data (June 16 2020 - 00:59)
...configuration data by default
...configuration data from /soft/AnnotSV-2.3.2/etc/AnnotSV/configfile
...configuration data given in arguments
...checking configuration data and files
WARNING: No GeneHancer annotations available.
(Please, see in the README file how to add these annotations. Users need to contact the GeneCards team.)
******************************************
AnnotSV has been run with these arguments:
******************************************
-SVinputFile /inputs/som_candidateSV.vcf
-SVinputInfo 1
-SVminSize 50
-annotationsDir /ref/AnnotSV/2.3.2
-bedtools bedtools
-candidateGenesFiltering no
-genomeBuild GRCh37
-metrics us
-minTotalNumber 500
-organism Human
-outputDir /outputs
-outputFile test_som.annotsv.tsv
-overlap 70
-overwrite yes
-promoterSize 500
-rankFiltering 1 2 3 4 5
-rankOutput no
-reciprocal no
-snvIndelPASS 0
-svtBEDcol -1
******************************************
...listing of the annotations to realized (June 16 2020 - 00:59)
...refGene annotation
(with /ref/AnnotSV/2.3.2/Annotations_Human/RefGene/GRCh37/refGene.sorted.bed)
...Genes-based annotations
...20181211_ACMG.tsv
(59 gene identifiers and 1 annotations columns: ACMG)
...20191219_DDG2P.tsv.gz
(1982 gene identifiers and 5 annotations columns: DDD_status, DDD_mode, DDD_consequence, DDD_disease, DDD_pmids)
...20191219_HI.tsv.gz
(19124 gene identifiers and 1 annotations columns: HI_DDDpercent)
...20191219_GeneIntolerance.pLI-Zscore.annotations.tsv.gz
(18241 gene identifiers and 3 annotations columns: synZ_ExAC, misZ_ExAC, pLI_ExAC)
...20191219_ExAC.CNV-Zscore.annotations.tsv.gz
(15673 gene identifiers and 3 annotations columns: delZ_ExAC, dupZ_ExAC, cnvZ_ExAC)
...20191216_OMIM-1-annotations.tsv.gz
(14411 gene identifiers and 1 annotations columns: Mim Number)
...20191216_morbidGenesCandidates.tsv.gz
(3136 gene identifiers and 1 annotations columns: morbidGenesCandidates)
...20191216_OMIM-2-annotations.tsv.gz
(14411 gene identifiers and 2 annotations columns: Phenotypes, Inheritance)
...20191216_morbidGenes.tsv.gz
(11249 gene identifiers and 1 annotations columns: morbidGenes)
...20191219_ClinGenAnnotations.tsv.gz
(1392 gene identifiers and 2 annotations columns: HI_CGscore, TriS_CGscore)
...Annotations with features overlapping the SV
...DGV Gold Standard frequency annotation
...gnomAD frequency annotation
...DDD frequency annotation
...1000g frequency annotation
...Ira M. Hall's lab frequency annotation
...Annotations with features overlapped with the SV
...Promoters annotation
...dbVar_pathogenic_NR_SV annotation
...TAD annotation
...Breakpoints annotations
...GC content annotation
...Repeat annotation
...annotation in progress (June 16 2020 - 00:59)
...Output columns annotation:
AnnotSV ID; SV chrom; SV start; SV end; SV length; SV type; ID; REF; ALT; QUAL; FILTER; INFO; AnnotSV type; Gene name; NM; CDS length; tx length; location; location2; intersectStart; intersectEnd; DGV_GAIN_IDs; DGV_GAIN_n_samples_with_SV; DGV_GAIN_n_samples_tested; DGV_GAIN_Frequency; DGV_LOSS_IDs; DGV_LOSS_n_samples_with_SV; DGV_LOSS_n_samples_tested; DGV_LOSS_Frequency; GD_ID; GD_AN; GD_N_HET; GD_N_HOMALT; GD_AF; GD_POPMAX_AF; GD_ID_others; DDD_SV; DDD_DUP_n_samples_with_SV; DDD_DUP_Frequency; DDD_DEL_n_samples_with_SV; DDD_DEL_Frequency; 1000g_event; 1000g_AF; 1000g_max_AF; IMH_ID; IMH_AF; IMH_ID_others; promoters; dbVar_event; dbVar_variant; dbVar_status; TADcoordinates; ENCODEexperiments; GCcontent_left; GCcontent_right; Repeats_coord_left; Repeats_type_left; Repeats_coord_right; Repeats_type_right; ACMG; DDD_status; DDD_mode; DDD_consequence; DDD_disease; DDD_pmids; HI_DDDpercent; synZ_ExAC; misZ_ExAC; pLI_ExAC; delZ_ExAC; dupZ_ExAC; cnvZ_ExAC; Mim Number; morbidGenesCandidates; Phenotypes; Inheritance; morbidGenes; HI_CGscore; TriS_CGscore; AnnotSV ranking
...AnnotSV is done with the analysis (June 16 2020 - 00:59)
root@27fe373fffff:/outputs# echo $ANNOTSV
/soft/AnnotSV-2.3.2
root@27fe373fffff:/outputs# ls -la $ANNOTSV
total 12
drwxr-xr-x. 5 root root 57 Jun 16 00:39 .
drwxr-xr-x. 1 root root 27 Jun 16 00:39 ..
-rwxr-xr-x. 1 root root 8843 Jun 11 22:10 Makefile
drwxr-xr-x. 2 root root 21 Jun 16 00:39 bin
drwxr-xr-x. 3 root root 21 Jun 16 00:39 etc
drwxr-xr-x. 5 root root 43 Jun 16 00:39 share
root@27fe373fffff:/outputs# ls -la $ANNOTSV/share/
total 0
drwxr-xr-x. 5 root root 43 Jun 16 00:39 .
drwxr-xr-x. 5 root root 57 Jun 16 00:39 ..
drwxr-xr-x. 3 root root 21 Jun 16 00:39 bash
drwxr-xr-x. 3 root root 21 Jun 16 00:39 doc
drwxr-xr-x. 3 root root 21 Jun 16 00:39 tcl8.6
root@27fe373fffff:/outputs# ls -la $ANNOTSV/etc/AnnotSV/
total 8
drwxr-xr-x. 2 root root 54 Jun 16 00:39 .
drwxr-xr-x. 3 root root 21 Jun 16 00:39 ..
-rw-r--r--. 1 root root 1468 Jun 16 00:39 application.properties
-rwxr-xr-x. 1 root root 2280 Jun 11 22:10 configfile
root@27fe373fffff:/outputs# ls -la /ref/AnnotSV/2.3.2/
total 0
drwxr-xr-x. 5 root root 70 Jun 16 00:53 .
drwxr-xr-x. 3 root root 19 Jun 15 23:22 ..
drwxr-xr-x. 3 root root 18 Jun 15 23:23 Annotations_Exomiser
drwxr-xr-x. 8 3054 3002 127 Dec 20 14:52 Annotations_Human
drwxrwxr-x. 2 root root 50 Jun 11 22:10 jar
root@27fe373fffff:/outputs# ls -la /ref/AnnotSV/2.3.2/Annotations_Exomiser/1902/1902_*
/ref/AnnotSV/2.3.2/Annotations_Exomiser/1902/1902_hg19:
total 114128
drwxr-xr-x. 2 3054 3002 109 Dec 13 2019 .
drwxr-xr-x. 4 root root 45 Jun 16 00:56 ..
-rw-r--r--. 1 3054 3002 65536 Dec 13 2019 1902_hg19_genome.h2.db
-rw-r--r--. 1 3054 3002 116785848 Dec 13 2019 1902_hg19_transcripts_ensembl.ser
-rw-r--r--. 1 3054 3002 12288 Dec 13 2019 1902_hg19_variants.mv.db
/ref/AnnotSV/2.3.2/Annotations_Exomiser/1902/1902_phenotype:
total 8619648
drwxr-xr-x. 3 root root 71 Jun 16 00:54 .
drwxr-xr-x. 4 root root 45 Jun 16 00:56 ..
-rw-r-----. 1 root root 8016973824 Mar 6 2019 1902_phenotype.h2.db
drwxr-xr-x. 2 root root 6 Mar 6 2019 phenix
-rwxr-x---. 1 root root 809545728 Mar 6 2019 rw_string_10.mv
Absolutely. It also requires an update of the "etc/AnnotSV/application.properties" file.
https://github.com/lgmgeo/AnnotSV/blob/master/etc/AnnotSV/application.properties#L29 - this line I assume?
Yes, and the following one: https://github.com/lgmgeo/AnnotSV/blob/master/etc/AnnotSV/application.properties#L30
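As a rough sketch of that edit: assuming the two linked lines are version settings whose key ends in data-version (as in Exomiser's own application.properties; this thread does not show the file, so treat the key name as an assumption), the version can be bumped with sed. The helper name bump_data_version is hypothetical, and the data directories and the renamed hg19 files still have to match the new version.

```shell
# Sketch: rewrite "...data-version=OLD" lines to NEW in a properties file.
# Assumption: the version keys end in "data-version"; other lines are kept.
bump_data_version() {
    props="$1"; old="$2"; new="$3"
    # Keep a backup next to the original file
    cp "$props" "$props.bak"
    sed "s/^\(.*data-version[[:space:]]*=[[:space:]]*\)$old[[:space:]]*\$/\1$new/" \
        "$props.bak" > "$props"
}
```

Usage would look like: bump_data_version "$ANNOTSV/etc/AnnotSV/application.properties" 1902 2003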
Looks like something is still not working:
You didn't give HPO argument in your command line to describe the phenotype of your patient. Try something like:
/soft/AnnotSV-2.3.2/bin/AnnotSV -annotationsDir /ref/AnnotSV/2.3.2 -genomeBuild GRCh37 -SVinputFile /inputs/som_candidateSV.vcf -outputFile /outputs/test_som.annotsv.tsv -hpo "HP:0001156,HP:0001363,HP:0011304,HP:0010055"
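For a pipeline where HPO terms are only sometimes known, the command above could be assembled by a small helper that appends -hpo only when terms are supplied. This is a sketch using the paths and options from this thread; the function name build_annotsv_cmd is hypothetical, and it only prints the command instead of running it (paths containing spaces would need extra quoting).

```shell
# Sketch: print an AnnotSV command line, adding -hpo only when HPO terms
# are given. $1 = input VCF, $2 = output TSV, $3 = optional HPO term list.
build_annotsv_cmd() {
    annotsv_cmd="/soft/AnnotSV-2.3.2/bin/AnnotSV -annotationsDir /ref/AnnotSV/2.3.2 -genomeBuild GRCh37 -SVinputFile $1 -outputFile $2"
    if [ -n "${3:-}" ]; then
        annotsv_cmd="$annotsv_cmd -hpo $3"
    fi
    echo "$annotsv_cmd"
}
```

Calling build_annotsv_cmd /inputs/som_candidateSV.vcf /outputs/test_som.annotsv.tsv "HP:0001156,HP:0001363" prints the command with the -hpo argument appended; without the third argument it is omitted.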
Yes, and the following one: https://github.com/lgmgeo/AnnotSV/blob/master/etc/AnnotSV/application.properties#L30
Of course!
You didn't give HPO argument in your command line to describe the phenotype of your patient. Try something like:
Is it possible not to specify this? In our case we usually do not know the phenotype of the patient, and we are looking for a way to annotate the structural variants in detail.
If you don't specify HPO terms in the command line, AnnotSV will not use the Exomiser module. This module provides a phenotype-driven analysis. The given score and annotations are specific to a phenotype (to a patient).
For a given phenotype, the HPO-based score corresponding to a damaging probability is provided for each gene overlapped with an SV so that:
Is it possible to provide all phenotypes at once?
You didn't give HPO argument in your command line to describe the phenotype of your patient. Try something like:
/soft/AnnotSV-2.3.2/bin/AnnotSV -annotationsDir /ref/AnnotSV/2.3.2 -genomeBuild GRCh37 -SVinputFile /inputs/som_candidateSV.vcf -outputFile /outputs/test_som.annotsv.tsv -hpo "HP:0001156,HP:0001363,HP:0011304,HP:0010055"
After your addition Exomiser started working, but only for one of my test files:
Command: '/soft/AnnotSV-2.3.2/bin/AnnotSV -annotationsDir /ref/AnnotSV/2.3.2 -genomeBuild GRCh37 -SVinputFile /inputs/germ_candidateSV.vcf -outputFile /outputs/test_germ.annotsv.tsv -hpo HP:0001156,HP:0001363,HP:0011304,HP:0010055'.
PID=156 (last job)
AnnotSV 2.3.2
Copyright (C) 2017-2019 GEOFFROY Veronique
Please feel free to contact me for any suggestions or bug reports
email: veronique.geoffroy@inserm.fr
Tcl/Tk version: 8.6
Application name used (defined with the "ANNOTSV" environment variable):
/soft/AnnotSV-2.3.2
...downloading the configuration data (June 16 2020 - 10:02)
...configuration data by default
...configuration data from /soft/AnnotSV-2.3.2/etc/AnnotSV/configfile
...configuration data given in arguments
...checking configuration data and files
WARNING: No GeneHancer annotations available.
(Please, see in the README file how to add these annotations. Users need to contact the GeneCards team.)
INFO: AnnotSV takes use of Exomiser (Smedley et al., 2015) for the phenotype-driven analysis.
INFO: AnnotSV is using the Human Phenotype Ontology (version 1902). Find out more at http://www.human-phenotype-ontology.org
******************************************
AnnotSV has been run with these arguments:
******************************************
-SVinputFile /inputs/germ_candidateSV.vcf
-SVinputInfo 1
-SVminSize 50
-annotationsDir /ref/AnnotSV/2.3.2
-bedtools bedtools
-candidateGenesFiltering no
-genomeBuild GRCh37
-hpo HP:0001156,HP:0001363,HP:0011304,HP:0010055
-metrics us
-minTotalNumber 500
-organism Human
-outputDir /outputs
-outputFile test_germ.annotsv.tsv
-overlap 70
-overwrite yes
-promoterSize 500
-rankFiltering 1 2 3 4 5
-rankOutput no
-reciprocal no
-snvIndelPASS 0
-svtBEDcol -1
******************************************
no intersection between SV and gene annotation
...listing of the annotations to realized (June 16 2020 - 10:02)
...refGene annotation
(with /ref/AnnotSV/2.3.2/Annotations_Human/RefGene/GRCh37/refGene.sorted.bed)
...Genes-based annotations
...20181211_ACMG.tsv
(59 gene identifiers and 1 annotations columns: ACMG)
...20191219_DDG2P.tsv.gz
(1982 gene identifiers and 5 annotations columns: DDD_status, DDD_mode, DDD_consequence, DDD_disease, DDD_pmids)
...20191219_HI.tsv.gz
(19124 gene identifiers and 1 annotations columns: HI_DDDpercent)
...20191219_GeneIntolerance.pLI-Zscore.annotations.tsv.gz
(18241 gene identifiers and 3 annotations columns: synZ_ExAC, misZ_ExAC, pLI_ExAC)
...20191219_ExAC.CNV-Zscore.annotations.tsv.gz
(15673 gene identifiers and 3 annotations columns: delZ_ExAC, dupZ_ExAC, cnvZ_ExAC)
...20191216_OMIM-1-annotations.tsv.gz
(14411 gene identifiers and 1 annotations columns: Mim Number)
...20191216_morbidGenesCandidates.tsv.gz
(3136 gene identifiers and 1 annotations columns: morbidGenesCandidates)
...20191216_OMIM-2-annotations.tsv.gz
(14411 gene identifiers and 2 annotations columns: Phenotypes, Inheritance)
...20191216_morbidGenes.tsv.gz
(11249 gene identifiers and 1 annotations columns: morbidGenes)
...20191219_ClinGenAnnotations.tsv.gz
(1392 gene identifiers and 2 annotations columns: HI_CGscore, TriS_CGscore)
...Annotations with features overlapping the SV
...DGV Gold Standard frequency annotation
...gnomAD frequency annotation
...DDD frequency annotation
...1000g frequency annotation
...Ira M. Hall's lab frequency annotation
...Annotations with features overlapped with the SV
...Promoters annotation
...dbVar_pathogenic_NR_SV annotation
...TAD annotation
...Breakpoints annotations
...GC content annotation
...Repeat annotation
...annotation in progress (June 16 2020 - 10:02)
...Output columns annotation:
AnnotSV ID; SV chrom; SV start; SV end; SV length; SV type; ID; REF; ALT; QUAL; FILTER; INFO; AnnotSV type; Gene name; NM; CDS length; tx length; location; location2; intersectStart; intersectEnd; DGV_GAIN_IDs; DGV_GAIN_n_samples_with_SV; DGV_GAIN_n_samples_tested; DGV_GAIN_Frequency; DGV_LOSS_IDs; DGV_LOSS_n_samples_with_SV; DGV_LOSS_n_samples_tested; DGV_LOSS_Frequency; GD_ID; GD_AN; GD_N_HET; GD_N_HOMALT; GD_AF; GD_POPMAX_AF; GD_ID_others; DDD_SV; DDD_DUP_n_samples_with_SV; DDD_DUP_Frequency; DDD_DEL_n_samples_with_SV; DDD_DEL_Frequency; 1000g_event; 1000g_AF; 1000g_max_AF; IMH_ID; IMH_AF; IMH_ID_others; promoters; dbVar_event; dbVar_variant; dbVar_status; TADcoordinates; ENCODEexperiments; GCcontent_left; GCcontent_right; Repeats_coord_left; Repeats_type_left; Repeats_coord_right; Repeats_type_right; ACMG; DDD_status; DDD_mode; DDD_consequence; DDD_disease; DDD_pmids; HI_DDDpercent; synZ_ExAC; misZ_ExAC; pLI_ExAC; delZ_ExAC; dupZ_ExAC; cnvZ_ExAC; Mim Number; morbidGenesCandidates; Phenotypes; Inheritance; morbidGenes; HI_CGscore; TriS_CGscore; AnnotSV ranking
...AnnotSV is done with the analysis (June 16 2020 - 10:02)
Command: '/soft/AnnotSV-2.3.2/bin/AnnotSV -annotationsDir /ref/AnnotSV/2.3.2 -genomeBuild GRCh37 -SVinputFile /inputs/som_candidateSV.vcf -outputFile /outputs/test_som.annotsv.tsv -hpo HP:0001156,HP:0001363,HP:0011304,HP:0010055'.
PID=156 (last job)
AnnotSV 2.3.2
Copyright (C) 2017-2019 GEOFFROY Veronique
Please feel free to contact me for any suggestions or bug reports
email: veronique.geoffroy@inserm.fr
Tcl/Tk version: 8.6
Application name used (defined with the "ANNOTSV" environment variable):
/soft/AnnotSV-2.3.2
...downloading the configuration data (June 16 2020 - 10:02)
...configuration data by default
...configuration data from /soft/AnnotSV-2.3.2/etc/AnnotSV/configfile
...configuration data given in arguments
...checking configuration data and files
WARNING: No GeneHancer annotations available.
(Please, see in the README file how to add these annotations. Users need to contact the GeneCards team.)
INFO: AnnotSV takes use of Exomiser (Smedley et al., 2015) for the phenotype-driven analysis.
INFO: AnnotSV is using the Human Phenotype Ontology (version 1902). Find out more at http://www.human-phenotype-ontology.org
******************************************
AnnotSV has been run with these arguments:
******************************************
-SVinputFile /inputs/som_candidateSV.vcf
-SVinputInfo 1
-SVminSize 50
-annotationsDir /ref/AnnotSV/2.3.2
-bedtools bedtools
-candidateGenesFiltering no
-genomeBuild GRCh37
-hpo HP:0001156,HP:0001363,HP:0011304,HP:0010055
-metrics us
-minTotalNumber 500
-organism Human
-outputDir /outputs
-outputFile test_som.annotsv.tsv
-overlap 70
-overwrite yes
-promoterSize 500
-rankFiltering 1 2 3 4 5
-rankOutput no
-reciprocal no
-snvIndelPASS 0
-svtBEDcol -1
******************************************
...running Exomiser
...on port 50000
...starting the REST service
...idService = 177
...listing of the annotations to realized (June 16 2020 - 10:03)
...refGene annotation
(with /ref/AnnotSV/2.3.2/Annotations_Human/RefGene/GRCh37/refGene.sorted.bed)
...Genes-based annotations
...20181211_ACMG.tsv
(59 gene identifiers and 1 annotations columns: ACMG)
...20191219_DDG2P.tsv.gz
(1982 gene identifiers and 5 annotations columns: DDD_status, DDD_mode, DDD_consequence, DDD_disease, DDD_pmids)
...20191219_HI.tsv.gz
(19124 gene identifiers and 1 annotations columns: HI_DDDpercent)
...20191219_GeneIntolerance.pLI-Zscore.annotations.tsv.gz
(18241 gene identifiers and 3 annotations columns: synZ_ExAC, misZ_ExAC, pLI_ExAC)
...20191219_ExAC.CNV-Zscore.annotations.tsv.gz
(15673 gene identifiers and 3 annotations columns: delZ_ExAC, dupZ_ExAC, cnvZ_ExAC)
...20191216_OMIM-1-annotations.tsv.gz
(14411 gene identifiers and 1 annotations columns: Mim Number)
...20191216_morbidGenesCandidates.tsv.gz
(3136 gene identifiers and 1 annotations columns: morbidGenesCandidates)
...20191216_OMIM-2-annotations.tsv.gz
(14411 gene identifiers and 2 annotations columns: Phenotypes, Inheritance)
...20191216_morbidGenes.tsv.gz
(11249 gene identifiers and 1 annotations columns: morbidGenes)
...20191219_ClinGenAnnotations.tsv.gz
(1392 gene identifiers and 2 annotations columns: HI_CGscore, TriS_CGscore)
...20200616-100245_exomiser_gene_pheno.tmp.tsv
(1 gene identifiers and 4 annotations columns: EXOMISER_GENE_PHENO_SCORE, HUMAN_PHENO_EVIDENCE, MOUSE_PHENO_EVIDENCE, FISH_PHENO_EVIDENCE)
...Annotations with features overlapping the SV
...DGV Gold Standard frequency annotation
...gnomAD frequency annotation
...DDD frequency annotation
...1000g frequency annotation
...Ira M. Hall's lab frequency annotation
...Annotations with features overlapped with the SV
...Promoters annotation
...dbVar_pathogenic_NR_SV annotation
...TAD annotation
...Breakpoints annotations
...GC content annotation
...Repeat annotation
...annotation in progress (June 16 2020 - 10:03)
...Output columns annotation:
AnnotSV ID; SV chrom; SV start; SV end; SV length; SV type; ID; REF; ALT; QUAL; FILTER; INFO; AnnotSV type; Gene name; NM; CDS length; tx length; location; location2; intersectStart; intersectEnd; DGV_GAIN_IDs; DGV_GAIN_n_samples_with_SV; DGV_GAIN_n_samples_tested; DGV_GAIN_Frequency; DGV_LOSS_IDs; DGV_LOSS_n_samples_with_SV; DGV_LOSS_n_samples_tested; DGV_LOSS_Frequency; GD_ID; GD_AN; GD_N_HET; GD_N_HOMALT; GD_AF; GD_POPMAX_AF; GD_ID_others; DDD_SV; DDD_DUP_n_samples_with_SV; DDD_DUP_Frequency; DDD_DEL_n_samples_with_SV; DDD_DEL_Frequency; 1000g_event; 1000g_AF; 1000g_max_AF; IMH_ID; IMH_AF; IMH_ID_others; promoters; dbVar_event; dbVar_variant; dbVar_status; TADcoordinates; ENCODEexperiments; GCcontent_left; GCcontent_right; Repeats_coord_left; Repeats_type_left; Repeats_coord_right; Repeats_type_right; ACMG; DDD_status; DDD_mode; DDD_consequence; DDD_disease; DDD_pmids; HI_DDDpercent; synZ_ExAC; misZ_ExAC; pLI_ExAC; delZ_ExAC; dupZ_ExAC; cnvZ_ExAC; Mim Number; morbidGenesCandidates; Phenotypes; Inheritance; morbidGenes; HI_CGscore; TriS_CGscore; EXOMISER_GENE_PHENO_SCORE; HUMAN_PHENO_EVIDENCE; MOUSE_PHENO_EVIDENCE; FISH_PHENO_EVIDENCE; AnnotSV ranking
...AnnotSV is done with the analysis (June 16 2020 - 10:03)
Is the no intersection between SV and gene annotation line a remnant of the Exomiser step?
Is it possible to provide all phenotypes at once?
Sorry, totally impossible :o) And it would make no sense to me, there are too many phenotypes (or combination of different phenotypes) possible. I can't even imagine this number (hundreds of thousands?). Moreover, it would be unreadable...
Is no intersection between SV and gene annotation line remnant of Exomiser step?
Absolutely. If there is no overlapped gene, there cannot be an Exomiser score linked to a gene.
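To tell these cases apart in a saved log, one option is to grep for the messages quoted in this thread. The sketch below relies only on those three message strings as they appear above; the function name exomiser_status and the labels it prints are made up for illustration.

```shell
# Sketch: classify an AnnotSV log by the Exomiser-related messages seen
# in this thread. $1 = path to a file holding the AnnotSV console output.
exomiser_status() {
    if grep -q 'running Exomiser' "$1"; then
        echo "ran"
    elif grep -q 'no intersection between SV and gene annotation' "$1"; then
        echo "not run: no gene overlapped"
    elif grep -q 'No Exomiser annotations available' "$1"; then
        echo "not run: Exomiser annotations missing"
    else
        # No Exomiser message at all, e.g. when -hpo was not given
        echo "not run: no Exomiser message in log"
    fi
}
```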
Thank you for your answers, Exomiser is finally working.
The manual update will wait until you update your part of the data.
Closing.
Hello.
We are trying to use external reference directories for your tool (we use Docker for tools and we don't want to store a 20 GB reference in the image); the tool supports this with the -annotationsDir flag. But for now all annotations are stored in one place - both AnnotSV's own and Exomiser's (for us). The problem will emerge with a future AnnotSV release: for now (AnnotSV 2.3.2) annotations are stored in 2.3.2/Annotations_Human and 2.3.2/Annotations_Exomiser/1902/1902_hg19. An Exomiser update will not cause a problem: 2.3.2/Annotations_Exomiser/2003/2003_hg19. An AnnotSV version update will cause all other annotations to be placed into directories like 2.3.3/Annotations_Exomiser/2003/2003_hg19, for example, even though the files can be unchanged. Putting a 2.3.2 directory (or something pointing to the current release version) inside Annotations_Human will fix this problem.