Open vedellpt opened 4 years ago
I have been able to get db, annotate, and sort to work on the toy dataset that you provide as an example. However, despite trying a number of things, I have not been able to get a successful run of annotate on my annovar input vcf, nor a successful run of sort on my vep input vcf. Is there anything different about the way the toy example dataset is processed compared to other datasets?
For my latest effort, I took a vep vcf from the vep web portal and tried to use it as input to tapes sort. I get the same error for an ASCII, Unicode, and Unix formatted input file; it is shown below. Can you tell me how to get past this? Is there a way to tell tapes which format the file is? It seems unable to figure it out in this function call.
[m4@mf4 tapes 2020-03-23 02:18:27] $ cat /research/tapes_sort_veptest8.err
Traceback (most recent call last):
File "tapes.py", line 355, in
Anyway, congratulations on your PLOS Computational Biology publication and on developing this tool. I think it is a nice publication and that it serves an important purpose.
I also have some problems with VEP vcf annotation, but since it seems that the tool is not supported, I'm not sure it makes sense to open an issue (I'd like to be wrong, because there are few options for ACMG assignments :( ).
Hi, I found a way to solve this problem:
You will need 2 plugins, dbscSNV and dbNSFP, for all annotations. Add them to the VEP command like this:
--plugin dbNSFP,/path/to/dbNSFP_hg19.gz,gnomAD_genomes_AF,gnomAD_exomes_AF,CADD_phred,FATHMM_converted_rankscore,clinvar_clnsig,clinvar_golden_stars,Interpro_domain,SIFT_score,LRT_pred,MutationTaster_pred,MutationAssessor_pred,FATHMM_pred,PROVEAN_score,MetaSVM_pred,MetaLR_pred,M-CAP_pred,fathmm-MKL_coding_pred,GenoCanyon_score,GERP++_RS \
--plugin dbscSNV,/path/to/dbscSNV1.1_GRCh37.txt.gz
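Before running tapes, it may help to confirm that the plugin fields actually made it into the VEP output. The sketch below is not part of tapes; it only reads the `##INFO=<ID=CSQ,...Format: ...>` header line that VEP writes and reports which of the dbNSFP fields named in the command above are absent. The names `EXPECTED_FIELDS`, `csq_fields`, `missing_fields`, and `read_header` are made up for illustration, and whether tapes requires exactly this set of fields is an assumption.

```python
import gzip
import re

# Field names copied from the --plugin dbNSFP line in the post above.
# Assumption: tapes may need more or fewer fields than these.
EXPECTED_FIELDS = [
    "gnomAD_genomes_AF", "gnomAD_exomes_AF", "CADD_phred",
    "FATHMM_converted_rankscore", "clinvar_clnsig", "clinvar_golden_stars",
    "Interpro_domain", "SIFT_score", "LRT_pred", "MutationTaster_pred",
    "MutationAssessor_pred", "FATHMM_pred", "PROVEAN_score", "MetaSVM_pred",
    "MetaLR_pred", "M-CAP_pred", "fathmm-MKL_coding_pred",
    "GenoCanyon_score", "GERP++_RS",
]

# VEP declares the CSQ subfields in the header as "Format: A|B|C".
CSQ_FORMAT = re.compile(r'##INFO=<ID=CSQ,.*Format: ([^">]+)')


def csq_fields(header_lines):
    """Return the CSQ subfield names declared in a VEP VCF header."""
    for line in header_lines:
        match = CSQ_FORMAT.match(line)
        if match:
            return match.group(1).strip().split("|")
    return []


def missing_fields(header_lines, expected=EXPECTED_FIELDS):
    """Return the expected field names that the CSQ header does not list."""
    present = set(csq_fields(header_lines))
    return [name for name in expected if name not in present]


def read_header(path):
    """Collect the '##' meta lines of a plain or gzipped VCF file."""
    opener = gzip.open if path.endswith(".gz") else open
    header = []
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break
            header.append(line.rstrip("\n"))
    return header
```

Calling `missing_fields(read_header("my_vep_output.vcf"))` (the file name is a placeholder) returns an empty list when every field from the command is declared in the header.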
Hi, thanks for posting it. Eventually I exchanged VEP for OpenCravat and rewrote the code of InterVar to make the two compatible.
Hello @EugeneEA, I'm working on variant interpretation. If you can provide the modified InterVar code, it would be helpful.
Hi, I'm running tapes on my VEP output, annotated just as you've posted here, but I'm still having an issue with the error: All required annotations not found.
Did you manage to solve the issue?
I am trying to run tapes using a vep vcf as input. The vep vcf contains the ClinVar significance information in the CLIN_SIG entry, which is part of the CSQ INFO field (description: "Consequence annotations from Ensembl VEP").
I have processed vep vcfs for 37 patients. I know some have pathogenic variants, but none are classified as such by tapes. I think it probably has to do with this message in the log:
"2020-03-20 09:51:55.....PS1 done || No trio data, skipping PS2 || No "clinvar_golden_stars" or "clinvar_clnsig" column found. Please annotate your data with a recent clinvar database || All frequency data not found || No domain data found 2020-03-20 09:51:55.....PM5 done"
I think it could also be related to this message:
"2020-03-20 09:51:55.....Starting... || Cannnot calculate PVS1, no splicing annotation. Please annotate with dbscSNV"
Can you please provide some suggestions on how I can get past these problems? Thanks.
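The log asks for "clinvar_clnsig" and "clinvar_golden_stars" columns, while the file described here carries its ClinVar calls in the CSQ subfield CLIN_SIG, so the two names do not match. One way to check what the file really contains is to pull CLIN_SIG out of each record. The following is only a rough sketch under that reading of the log, not tapes code; `parse_csq_format` and `clin_sig_per_record` are hypothetical helper names.

```python
def parse_csq_format(header_line):
    """Split the 'Format: a|b|c' part of the CSQ header into field names."""
    fmt = header_line.split("Format: ", 1)[1]
    return fmt.rstrip('">').split("|")


def clin_sig_per_record(vcf_lines):
    """Yield (chrom, pos, set of CLIN_SIG values) for each VCF record."""
    fields = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO=<ID=CSQ,"):
            fields = parse_csq_format(line)
            continue
        if line.startswith("#") or not line:
            continue
        cols = line.split("\t")
        # INFO is the 8th column; keep only key=value items.
        info = dict(
            item.split("=", 1) for item in cols[7].split(";") if "=" in item
        )
        values = set()
        if "CLIN_SIG" in fields and "CSQ" in info:
            idx = fields.index("CLIN_SIG")
            # One CSQ entry per transcript/allele, separated by commas.
            for entry in info["CSQ"].split(","):
                parts = entry.split("|")
                if idx < len(parts) and parts[idx]:
                    # VEP joins several values in one subfield with '&'.
                    values.update(parts[idx].split("&"))
        yield cols[0], cols[1], values
```

Passing an open file handle, for example `clin_sig_per_record(open("patient.vep.vcf"))` with a placeholder file name, lists the ClinVar significances VEP recorded per variant. If pathogenic calls show up here but the header has no clinvar_clnsig field, the missing dbNSFP annotation described in the log is the likely reason tapes does not use them.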