WGLab / doc-ANNOVAR

Documentation for the ANNOVAR software
http://annovar.openbioinformatics.org

revel format ? #232

Closed gaom001 closed 1 year ago

gaom001 commented 1 year ago

I would like to annotate variants with REVEL (downloaded directly from the REVEL website). However, I believe the format is not correct; please see the error below:

NOTICE: Reading gene annotation from /humandb/hg19_revel.txt ...
Error: invalid record in /humandb/hg19_revel.txt (>=11 fields expected in revel gene definition file):

Any suggestions?

Chr Start End Ref Alt REVEL
1 35142 35142 G A 0.027
1 35142 35142 G C 0.035
1 35142 35142 G T 0.043
1 35143 35143 T A 0.018
1 35143 35143 T C 0.034
1 35143 35143 T G 0.039
1 35144 35144 A C 0.012
1 35145 35145 C A 0.023
1 35145 35145 C G 0.029
1 35145 35145 C T 0.016
1 35146 35146 A C 0.031
1 35146 35146 A G 0.016
1 35146 35146 A T 0.025
1 35147 35147 T A 0.004
1 35147 35147 T G 0.004
1 35148 35148 A G 0.010
1 35149 35149 A C 0.029
1 35149 35149 A T 0.022
1 35150 35150 T A 0.038
1 35150 35150 T G 0.055
1 35151 35151 C A 0.037
1 35151 35151 C G 0.036

kaichop commented 1 year ago

You can use -filter as the annotation type. Right now you are treating revel as a gene-annotation. If you show the command line then I can advise more. I assume the file format that you listed below is the hg19_revel.txt file. It is in a generic filter annotation file format.

gaom001 commented 1 year ago

Dr. Wang, thanks. Yes, I changed to the generic format. Should this file include the header line: chr, start, end, ref, alt, revel? Below is the command I ran:

/work/isabl/home/gaom4/annovar/table_annovar.pl /work/isabl/home/gaom4/rare_variants/merge_423_dp10_norm_subset_norm_pass_0.vcf /work/isabl/home/gaom4/annovar/humandb/ -buildver hg19 -out /work/isabl/home/gaom4/rare_variants/mg -remove -protocol refGene,revel,avsnp150,dbnsfp42c,dbscsnv11,clinvar_20221231,intervar_20180118,gnomad_genome -operation g,f,f,f,f,f,f,f -arg '-hgvs',,,,,,, -nastring . -vcfinput -polish --onetranscript

I ran this command but hit another issue; please see below:

NOTICE: Processing operation=f protocol=revel

NOTICE: Running system command <annotate_variation.pl -filter -dbtype revel -buildver hg19 -outfile /work/isabl/home/gaom4/rare_variants/mg /work/isabl/home/gaom4/rare_variants/mg.avinput /work/isabl/home/gaom4/annovar/humandb/>
NOTICE: the --dbtype revel is assumed to be in generic ANNOVAR database format
NOTICE: Output file with variants matching filtering criteria is written to /work/isabl/home/gaom4/rare_variants/mg.hg19_revel_dropped, and output file with other variants is written to /work/isabl/home/gaom4/rare_variants/mg.hg19_revel_filtered
NOTICE: Processing next batch with 14949 unique variants in 15957 input lines
NOTICE: Scanning filter database /work/isabl/home/gaom4/annovar/humandb/hg19_revel.txt...
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2900.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2901.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2902.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2903.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2904.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2905.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2906.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2907.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2908.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2909.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2910.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2911.
Argument "." isn't numeric in numeric eq (==) at /work/isabl/home/gaom4/annovar/annotate_variation.pl line 2603, line 2912.
........

kaichop commented 1 year ago

The first line should start with # so that it is clearly a comment line. It seems that some of the lines in the database file have issues, though you did specify '-nastring .', so the presence of '.' should not be an issue by itself. You may want to check that every line has the same format (for example, it is possible that some lines have "." in the second or third field, which should be the start and end positions). Also, what is at line 2603 of annotate_variation.pl? If you can print the 5 lines before it and the 5 lines after it, that would be great. My version differs from yours, so I cannot tell.
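The check suggested above can be sketched in a few lines of Python. This is only an illustration, not part of ANNOVAR: the function name `find_bad_rows` is hypothetical, and it assumes the database file is tab-delimited with Chr, Start, End, Ref, Alt as the first five fields. It reports any row that is not a `#` comment and has fewer than five fields or a non-integer Start or End.

```python
import io

def find_bad_rows(lines, min_fields=5):
    """Yield (line_number, reason, text) for rows that break the assumed generic format."""
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # this checker ignores blank lines and '#' comment/header lines
        fields = line.split("\t")  # assumption: fields are tab-delimited
        if len(fields) < min_fields:
            yield lineno, "fewer than %d fields" % min_fields, line
            continue
        start, end = fields[1], fields[2]
        if not (start.isdigit() and end.isdigit()):
            yield lineno, "non-integer Start/End", line

# Small in-memory sample; the two data rows are taken from the thread.
sample = io.StringIO(
    "#Chr\tStart\tEnd\tRef\tAlt\tREVEL\n"
    "1\t70005\t70005\tT\tG\t0.079\n"
    "1\t367659\t.\tA\tC\t0.254\n"
)
bad = list(find_bad_rows(sample))
for lineno, reason, text in bad:
    print(lineno, reason, text)
```

To scan a real file, pass an open file handle instead of the in-memory sample, for example `find_bad_rows(open("hg19_revel.txt"))`. A header line that does not start with `#` is reported as well, because its Start and End fields are not integers.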

gaom001 commented 1 year ago

2597 $chr =~ s/^chr//i; #changed 20120712; when genericdb contains "chr", all variants will be filtered.
2598 if ($chromosome) {
2599     $valichr{$chr} or next;
2600 }
2601 defined $obs or die "Error: invalid record found in DB file $dbfile (at least 5 fields expected for 'generic' dbtype): <$_>\n";
2602
2603 if ($start == $end and $ref eq '-') { #insertion
2604     $obs = "0$obs";
2605 } elsif ($obs eq '-') { #deletion
2606     $obs = $end-$start+1;
2607 } elsif ($start != $end or $start==$end and length($obs)>1) { #block substitution fixed 20130430
2608     $obs = ($end-$start+1) . $obs;
2609 }
2610 if (defined $score and defined $score_threshold and $score=~/\d{1,}.{0,1}\d{0,}/) {
2611     if ($reverse) {
2612         $score > $score_threshold and next;
2613     } else {
2614         $score < $score_threshold and next;
2615     }
2616 }

kaichop commented 1 year ago

So in some of the lines, your start and end are not integers, but ".", which causes the error message.

gaom001 commented 1 year ago

You are right,

2895 1 70003 70003 T G 0.035
2896 1 70004 70004 T A 0.029
2897 1 70004 70004 T C 0.023
2898 1 70004 70004 T G 0.028
2899 1 70005 70005 T A 0.079
2900 1 70005 70005 T G 0.079
2901 1 367659 . A C 0.254
2902 1 367659 . A G 0.220
2903 1 367659 . A T 0.237
2904 1 367660 . T A 0.228
2905 1 367660 . T C 0.222
2906 1 367660 . T G 0.254
2907 1 367661 . G A 0.211
2908 1 367661 . G C 0.211
2909 1 367661 . G T 0.218
2910 1 367662 . G A 0.045
2911 1 367662 . G C 0.035
2912 1 367662 . G T 0.049
2913 1 367663 . A C 0.016
2914 1 367663 . A G 0.024
2915 1 367663 . A T 0.024
2916 1 367664 . T A 0.028
2917 1 367664 . T G 0.028
2918 1 367665 . G A 0.058
2919 1 367665 . G C 0.058
2920 1 367666 . G A 0.046
2921 1 367666 . G C 0.038

I am not sure why this happened. Let me double-check my downloaded file first. Thanks!
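For readers who hit the same symptom, one possible repair is sketched below in Python. The thread does not record why End was "." or how the file was eventually fixed, so treat this as an assumption to verify against the source data: the helper `fill_missing_end` (a hypothetical name) assumes tab-delimited rows in which every affected record is a single-base substitution, so End can be set equal to Start. Regenerating the file from the original download may well be the safer route.

```python
def fill_missing_end(line):
    """Return the row with End set to Start when End is '.' and Ref is a single base."""
    text = line.rstrip("\r\n")
    fields = text.split("\t")  # assumption: fields are tab-delimited
    if text.startswith("#") or len(fields) < 5:
        return text  # leave comment lines and short rows untouched
    start, end, ref = fields[1], fields[2], fields[3]
    # Only repair single-base Ref rows; anything else is left for manual review.
    if end == "." and start.isdigit() and len(ref) == 1 and ref != "-":
        fields[2] = start
    return "\t".join(fields)

# Two rows from the listing above, rewritten as tab-delimited records.
rows = ["1\t70005\t70005\tT\tG\t0.079", "1\t367659\t.\tA\tC\t0.254"]
fixed = [fill_missing_end(r) for r in rows]
print("\n".join(fixed))
```

Rows whose Ref is longer than one base, or whose Start is itself not an integer, are returned unchanged so they can be inspected by hand rather than silently altered.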

gaom001 commented 1 year ago

Problem solved. Thank you, Dr. Wang. I am going to close this.