WGLab / doc-ANNOVAR

Documentation for the ANNOVAR software
http://annovar.openbioinformatics.org

Errors of table_annovar.pl #131

Open xiw588 opened 3 years ago

xiw588 commented 3 years ago

Dear Dr. Wang,

I encountered an issue when running table_annovar.pl in the terminal:

perl table_annovar.pl 0fb608c4-5f79-4fad-a06e-4fa00ec3bfa0_impute-vcf-merged.vcf.gz humandb/ -buildver hg19 -out m -remove -protocol refGene,cytoBand,exac03,avsnp147,dbnsfp30a -operation gx,r,f,f,f -nastring . -vcfinput -polish -xreffile example/gene_fullxref.txt

The error message is:

Error: the last column in header row should start with 'Otherinfo'

The header of my VCF file is shown below; the file format is VCFv4.2:

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT

Can you help me fix this issue?

Thank you very much for your time and help.

kaichop commented 3 years ago

Please show the complete error message after running your command.

xiw588 commented 3 years ago

Thank you, Dr. Wang, for your prompt response! Here is the complete error message.

NOTICE: the --polish argument is set ON automatically (use --nopolish to change this behavior)
NOTICE: Running with system command <convert2annovar.pl -includeinfo -allsample -withfreq -format vcf4 n/holyscratch01/christiani_lab/Everyone/LP-WGS/VCF/0fb608c4-5f79-4fad-a06e-4fa00ec3bfa0_impute-vcf-merged.vcf.gz > m.avinput>
gzip: n/holyscratch01/christiani_lab/Everyone/LP-WGS/VCF/0fb608c4-5f79-4fad-a06e-4fa00ec3bfa0_impute-vcf-merged.vcf.gz: No such file or directory
NOTICE: Finished reading 0 lines from VCF file
NOTICE: A total of 0 locus in VCF file passed QC threshold, representing 0 SNPs (0 transitions and 0 transversions) and 0 indels/substitutions
NOTICE: Finished writing allele frequencies based on 0 SNP genotypes (0 transitions and 0 transversions) and 0 indels/substitutions for 0 samples
NOTICE: Running with system command <table_annovar.pl m.avinput humandb/ -buildver hg38 -outfile m -remove -protocol refGene,avsnp150,dbnsfp41c,clinvar_20210123 -operation gx,f,f,f -Otherinfo -nastring . -xreffile example/gene_fullxref.txt -otherinfo>
NOTICE: the --polish argument is set ON automatically (use --nopolish to change this behavior)
NOTICE: Processing operation=gx protocol=refGene
NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg38 -dbtype refGene -outfile m.refGene -exonsort -nofirstcodondel m.avinput humandb/>
NOTICE: Output files are written to m.refGene.variant_function, m.refGene.exonic_variant_function
NOTICE: Reading gene annotation from humandb/hg38_refGene.txt ... Done with 82500 transcripts (including 20366 without coding sequence annotation) for 28265 unique genes
NOTICE: Running with system command <coding_change.pl m.refGene.exonic_variant_function.orig humandb//hg38_refGene.txt humandb//hg38_refGeneMrna.fa -alltranscript -out m.refGene.fa -newevf m.refGene.exonic_variant_function>
NOTICE: The xrefkey is provided in header as <pLi pRec pNull Gene_full_name Function_description Disease_description Tissue_specificity(Uniprot) Expression(egenetics) Expression(GNF/Atlas) P(HI) P(rec) RVIS RVIS_percentile GDI GDI-Phred>
NOTICE: Finished reading 597255 cross references (each with 15 fields) from example/gene_fullxref.txt
NOTICE: Processing operation=f protocol=avsnp150
NOTICE: Running system command <annotate_variation.pl -filter -dbtype avsnp150 -buildver hg38 -outfile m m.avinput humandb/>
NOTICE: Output file with variants matching filtering criteria is written to m.hg38_avsnp150_dropped, and output file with other variants is written to m.hg38_avsnp150_filtered
NOTICE: Processing operation=f protocol=dbnsfp41c
NOTICE: Finished reading 43 column headers for '-dbtype dbnsfp41c'
NOTICE: Running system command <annotate_variation.pl -filter -dbtype dbnsfp41c -buildver hg38 -outfile m m.avinput humandb/ -otherinfo>
NOTICE: Output file with variants matching filtering criteria is written to m.hg38_dbnsfp41c_dropped, and output file with other variants is written to m.hg38_dbnsfp41c_filtered
NOTICE: Processing operation=f protocol=clinvar_20210123
NOTICE: Finished reading 5 column headers for '-dbtype clinvar_20210123'
NOTICE: Running system command <annotate_variation.pl -filter -dbtype clinvar_20210123 -buildver hg38 -outfile m m.avinput humandb/ -otherinfo>
NOTICE: the --dbtype clinvar_20210123 is assumed to be in generic ANNOVAR database format
NOTICE: Output file with variants matching filtering criteria is written to m.hg38_clinvar_20210123_dropped, and output file with other variants is written to m.hg38_clinvar_20210123_filtered
NOTICE: Multianno output file is written to m.hg38_multianno.txt
NOTICE: Reading from m.hg38_multianno.txt
Use of uninitialized value $_ in substitution (s///) at table_annovar.pl line 118.
Use of uninitialized value $_ in substitution (s///) at table_annovar.pl line 119.
Use of uninitialized value $_ in split at table_annovar.pl line 120.
Use of uninitialized value within @name in pattern match (m//) at table_annovar.pl line 121.
Error: the last column in header row should start with 'Otherinfo'

kaichop commented 3 years ago

The main red flag that caught my eye is that 0 lines were read from the file, because the gz file does not exist. I think you probably have an extra "n/" in the file path name.

gzip: n/holyscratch01/christiani_lab/Everyone/LP-WGS/VCF/0fb608c4-5f79-4fad-a06e-4fa00ec3bfa0_impute-vcf-merged.vcf.gz: No such file or directory NOTICE: Finished reading 0 lines from VCF file
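
A pre-flight check along the following lines could catch this situation before ANNOVAR is run at all. It is only a minimal sketch: the helper name `count_vcf_records` is hypothetical and not part of ANNOVAR, and it assumes a plain-text or gzip-compressed VCF in which header lines start with `#`.

```python
import gzip
import os


def count_vcf_records(path):
    """Count data (non-header) lines in a plain or gzip-compressed VCF file.

    Hypothetical helper, not part of ANNOVAR. Raises FileNotFoundError when
    the path does not point to an existing file, which is the situation
    reported by gzip in the log above.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(f"VCF file not found: {path}")
    # Choose the opener from the file extension (an assumption of this sketch).
    opener = gzip.open if path.endswith(".gz") else open
    records = 0
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and '#'-prefixed header lines.
            if line.strip() and not line.startswith("#"):
                records += 1
    return records
```

If the function raises an error or returns 0, there is nothing for table_annovar.pl to annotate, and the 'Otherinfo' header error is a downstream symptom rather than the real problem.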

xiw588 commented 3 years ago

Dear Dr. Wang,

I figured out this issue, and the VCF file could be converted to ANNOVAR input for annotation. But another error occurred, with the following message. Can you please help me look into this?

Thank you!

perl annovar/table_annovar.pl test.avinput annovar/humandb/ -buildver hg38 -out test -remove -protocol refGene,exac03,avsnp150,dbnsfp41c,clinvar_20210123 -operation g,f,f,f,f -nastring . -csvout -polish

NOTICE: Processing operation=g protocol=refGene
NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg38 -dbtype refGene -outfile test.refGene -exonsort -nofirstcodondel test.avinput annovar/humandb/>
NOTICE: Output files are written to test.refGene.variant_function, test.refGene.exonic_variant_function
NOTICE: Reading gene annotation from annovar/humandb/hg38_refGene.txt ... Done with 82500 transcripts (including 20366 without coding sequence annotation) for 28265 unique genes
Error running system command: <annotate_variation.pl -geneanno -buildver hg38 -dbtype refGene -outfile test.refGene -exonsort -nofirstcodondel test.avinput annovar/humandb/>

kaichop commented 3 years ago

It is best to use the VCF file directly for annotation, since the correct way to run convert2annovar depends on the specification of the VCF file and most people do not know how to use it correctly.

Also, if there is an error, please show the entire error message, not just part of it.
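
One way to collect the entire output is to rerun the failing command with stdout and stderr redirected into a single log file. The sketch below is an illustration only: `run_and_capture` is a hypothetical helper, and the script path in the commented example is an assumption about where ANNOVAR is installed.

```python
import subprocess


def run_and_capture(command, log_path):
    """Run a command, write its combined stdout and stderr to log_path,
    and return the exit code.

    Hypothetical helper, not part of ANNOVAR.
    """
    with open(log_path, "w") as log:
        # stderr is merged into stdout so that nothing is lost from the log.
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Example (assumed install path; arguments copied from the log in this thread):
# run_and_capture(
#     ["perl", "annovar/annotate_variation.pl", "-geneanno", "-buildver", "hg38",
#      "-dbtype", "refGene", "-outfile", "test.refGene", "-exonsort",
#      "-nofirstcodondel", "test.avinput", "annovar/humandb/"],
#     "annotate_variation.log",
# )
```

A non-zero return value means the command failed, and the log file then holds the full output to paste into the issue.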

xiw588 commented 3 years ago

Dear Dr. Wang, thank you for your prompt response! Actually, I tried to use the VCF file as input, but it failed for some reason, which is why I switched to using the converted file as input. Here are the command and the error messages.

perl Users/xinanwang/annovar/table_annovar.pl Everyone/LP-WGS/VCF/0fb608c4-5f79-4fad-a06e-4fa00ec3bfa0_impute-vcf-merged.vcf.gz Users/xinanwang/annovar/humandb/ -buildver hg38 -out Users/xinanwang/test -remove -protocol refGene,exac03,avsnp150,dbnsfp41c,clinvar_20210123 -operation gx,f,f,f,f -nastring . -vcfinput -polish -xreffile Users/xinanwang/annovar/example/gene_fullxref.txt

NOTICE: Running with system command <convert2annovar.pl -includeinfo -allsample -withfreq -format vcf4 Everyone/LP-WGS/VCF/0fb608c4-5f79-4fad-a06e-4fa00ec3bfa0_impute-vcf-merged.vcf.gz > Users/xinanwang/test.avinput>
NOTICE: Finished reading 85199335 lines from VCF file
NOTICE: A total of 85199328 locus in VCF file passed QC threshold, representing 81613191 SNPs (55035657 transitions and 26577534 transversions) and 3586133 indels/substitutions
NOTICE: Finished writing allele frequencies based on 244839573 SNP genotypes (165106971 transitions and 79732602 transversions) and 10758399 indels/substitutions for 3 samples
WARNING: 4 invalid reference alleles found in input file
NOTICE: Running with system command <Users/xinanwang/annovar/table_annovar.pl Users/xinanwang/test.avinput Users/xinanwang/annovar/humandb/ -buildver hg38 -outfile Users/xinanwang/test -remove -protocol refGene,exac03,avsnp150,dbnsfp41c,clinvar_20210123 -operation gx,f,f,f,f -nastring . -polish -xreffile Users/xinanwang/annovar/example/gene_fullxref.txt -otherinfo>
NOTICE: Processing operation=gx protocol=refGene
NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg38 -dbtype refGene -outfile Users/xinanwang/test.refGene -exonsort -nofirstcodondel Users/xinanwang/test.avinput Users/xinanwang/annovar/humandb/>
NOTICE: Output files are written to Users/xinanwang/test.refGene.variant_function, Users/xinanwang/test.refGene.exonic_variant_function
NOTICE: Reading gene annotation from Users/xinanwang/annovar/humandb/hg38_refGene.txt ... Done with 82500 transcripts (including 20366 without coding sequence annotation) for 28265 unique genes
Error running system command: <annotate_variation.pl -geneanno -buildver hg38 -dbtype refGene -outfile Users/xinanwang/test.refGene -exonsort -nofirstcodondel Users/xinanwang/test.avinput Users/xinanwang/annovar/humandb/>
Error running system command: <Users/xinanwang/annovar/table_annovar.pl Users/xinanwang/test.avinput Users/xinanwang/annovar/humandb/ -buildver hg38 -outfile Users/xinanwang/test -remove -protocol refGene,exac03,avsnp150,dbnsfp41c,clinvar_20210123 -operation gx,f,f,f,f -nastring . -polish -xreffile Users/xinanwang/annovar/example/gene_fullxref.txt -otherinfo>

ysq1770368148 commented 2 years ago

Dear Dr. Wang, I have a .vcf file to annotate, and I used convert2annovar.pl to convert it. This is my command:

perl convert2annovar.pl -format vcf4 -includeinfo -withfreq SRR8991000.g.vcf -o SRR8991000.avinput

But many lines were not converted correctly. Can you help me with this problem?

Use of uninitialized value $read_depth in join or string at convert2annovar.pl line 2491, line 713862.
Use of uninitialized value $newstart in join or string at convert2annovar.pl line 2491, line 713863.
Use of uninitialized value $newend in join or string at convert2annovar.pl line 2491, line 713863.
Use of uninitialized value $newref in join or string at convert2annovar.pl line 2491, line 713863.
Use of uninitialized value $read_depth in join or string at convert2annovar.pl line 2491, line 713863.
Use of uninitialized value $newstart in join or string at convert2annovar.pl line 2491, line 713864.
Use of uninitialized value $newend in join or string at convert2annovar.pl line 2491, line 713864.
Use of uninitialized value $newref in join or string at convert2annovar.pl line 2491, line 713864.
Use of uninitialized value $read_depth in join or string at convert2annovar.pl line 2491, line 713864.
Use of uninitialized value $newstart in join or string at convert2annovar.pl line 2491, line 713865.
Use of uninitialized value $newend in join or string at convert2annovar.pl line 2491, line 713865.
Use of uninitialized value $newref in join or string at convert2annovar.pl line 2491, line 713865.
Use of uninitialized value $read_depth in join or string at convert2annovar.pl line 2491, line 713865.
Use of uninitialized value $newstart in join or string at convert2annovar.pl line 2491, line 713866.
Use of uninitialized value $newend in join or string at convert2annovar.pl line 2491, line 713866.
Use of uninitialized value $newref in join or string at convert2annovar.pl line 2491, line 713866.
Use of uninitialized value $newend in join or string at convert2annovar.pl line 2491, line 836936.
Use of uninitialized value $newend in join or string at convert2annovar.pl line 2491, line 844718.
Use of uninitialized value $newref in join or string at convert2annovar.pl line 2491, line 844718.
Use of uninitialized value $read_depth in join or string at convert2annovar.pl line 2491, line 844718.
Use of uninitialized value $newstart in join or string at convert2annovar.pl line 2491, line 844719.
Use of uninitialized value $newend in join or string at convert2annovar.pl line 2491, line 844719.
Use of uninitialized value $newref in join or string at convert2annovar.pl line 2491, line 844719.
Use of uninitialized value $read_depth in join or string at convert2annovar.pl line 2491, line 844719.
Use of uninitialized value $newstart in join or string at convert2annovar.pl line 2491, line 844720.
Use of uninitialized value $newend in join or string at convert2annovar.pl line 2491, line 844720.
Use of uninitialized value $newref in join or string at convert2annovar.pl line 2491, line 844720.
Use of uninitialized value $read_depth in join or string at convert2annovar.pl line 2491, line 844720.
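
The thread does not establish what causes these warnings. As a diagnostic only, the sketch below assumes (from the `.g.vcf` extension, which is not proof) that the input may be a gVCF containing records whose ALT column holds only symbolic alleles such as `<NON_REF>`, and it counts how many records fall into that category. The function name `summarize_alt_alleles` is hypothetical, and the link between such records and the warnings is an unverified assumption.

```python
def summarize_alt_alleles(vcf_lines):
    """Classify VCF data records by the content of their ALT column.

    Hypothetical diagnostic, not part of ANNOVAR. Returns counts of records
    with no sequence ALT allele ('.' or only symbolic alleles such as
    <NON_REF>), records with at least one sequence ALT allele, and records
    with fewer than five tab-separated columns.
    """
    counts = {"no_sequence_alt": 0, "has_sequence_alt": 0, "malformed": 0}
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and '#'-prefixed header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            counts["malformed"] += 1
            continue
        alts = fields[4].split(",")
        # Symbolic alleles are written in angle brackets in the VCF format.
        if all(a == "." or (a.startswith("<") and a.endswith(">")) for a in alts):
            counts["no_sequence_alt"] += 1
        else:
            counts["has_sequence_alt"] += 1
    return counts
```

The function accepts any iterable of lines, so an open file handle for SRR8991000.g.vcf can be passed directly. A large `no_sequence_alt` count would suggest looking at those records first, for example around the input line numbers reported in the warnings.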

cucearda commented 1 year ago

Hello! Were you able to solve this problem? I am also experiencing it.