WGLab / doc-ANNOVAR

Documentation for the ANNOVAR software
http://annovar.openbioinformatics.org

Multiple errors for hg38? #224

Closed noranekonobokkusu closed 1 year ago

noranekonobokkusu commented 1 year ago

Hi,

I am trying to run ANNOVAR using hg38, but it crashes when I try to use exac03 or gnomad211_exome, with errors like this:

NOTICE: Reading gene annotation from humandb/hg38_exac03.txt ... Error: invalid dbstrand information found in humandb/hg38_exac03.txt (dbstrand has to be + or -): <#Chr Start End Ref Alt ExAC_ALL ExAC_AFR ExAC_AMR ExAC_EAS ExAC_FIN ExAC_NFE ExAC_OTH ExAC_SAS>

and for cytoBand, with this error:

NOTICE: Reading gene annotation from humandb/hg38_cytoBand.txt ... Error: invalid record in humandb/hg38_cytoBand.txt (>=11 fields expected in cytoBand gene definition file): <chr1 0 2300000 p36.33 gneg>

It feels like it expects to see a different format for these files (which I downloaded using annotate_variation.pl).
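
To make the mismatch concrete, here is a rough sketch in Python (not part of ANNOVAR) that applies the two checks named in the error messages to the exact lines they quote. The helper names `count_fields` and `has_strand_field` are made up for illustration, and splitting on whitespace is an assumption about how the quoted lines were delimited.

```python
# Lines copied from the two error messages quoted above.
CYTOBAND_RECORD = "chr1 0 2300000 p36.33 gneg"
EXAC_HEADER = (
    "#Chr Start End Ref Alt ExAC_ALL ExAC_AFR ExAC_AMR "
    "ExAC_EAS ExAC_FIN ExAC_NFE ExAC_OTH ExAC_SAS"
)

# Threshold taken from the cytoBand error text (">=11 fields expected").
MIN_GENE_FIELDS = 11


def count_fields(line):
    """Count whitespace-separated fields (hypothetical helper)."""
    return len(line.split())


def has_strand_field(line):
    """True if any field is a bare '+' or '-' (hypothetical helper)."""
    return any(field in ("+", "-") for field in line.split())


if __name__ == "__main__":
    for label, line in (("cytoBand", CYTOBAND_RECORD), ("exac03", EXAC_HEADER)):
        print(
            label,
            "fields:", count_fields(line),
            "enough for a gene record:", count_fields(line) >= MIN_GENE_FIELDS,
            "has strand:", has_strand_field(line),
        )
```

Neither line looks like a gene definition record: the cytoBand line has too few fields, and the exac03 header has enough fields but no strand value, which matches the two different errors.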

My command and the error messages are

./table_annovar.pl example/ex1.avinput humandb/ -buildver hg38 -out myanno -remove -protocol refGene,cytoBand,exac03 -operation g,g,g -nastring . -csvout -polish -xref example/gene_xref.txt
-----------------------------------------------------------------
NOTICE: Processing operation=g protocol=refGene

NOTICE: Running with system command 
NOTICE: Output files are written to myanno.refGene.variant_function, myanno.refGene.exonic_variant_function
NOTICE: Reading gene annotation from humandb/hg38_refGene.txt ... Done with 88819 transcripts (including 21511 without coding sequence annotation) for 28307 unique genes
NOTICE: Processing next batch with 21 unique variants in 21 input lines
NOTICE: Reading FASTA sequences from humandb/hg38_refGeneMrna.fa ... Done with 3 sequences
WARNING: A total of 606 sequences will be ignored due to lack of correct ORF annotation

NOTICE: Running with system command 
-----------------------------------------------------------------
NOTICE: Processing operation=g protocol=cytoBand

NOTICE: Running with system command 
NOTICE: Output files are written to myanno.cytoBand.variant_function, myanno.cytoBand.exonic_variant_function
NOTICE: Reading gene annotation from humandb/hg38_cytoBand.txt ... Error: invalid record in humandb/hg38_cytoBand.txt (>=11 fields expected in cytoBand gene definition file): 

Error running system command: 

(I originally tried it on my own dataset which crashed with the same message)

Thanks!

kaichop commented 1 year ago

cytoBand is a region-based annotation and exac03 is a filter-based annotation, so your -operation argument should be g,r,f (not g,g,g).
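
As a minimal sketch of the point above, the following Python snippet (not part of ANNOVAR) checks a `-protocol` / `-operation` pair against a small lookup table before a run. The table `EXPECTED_OPERATION` and both function names are hypothetical; the table only covers the databases mentioned in this thread and would need extending for anything else.

```python
# Operation codes: g = gene-based, r = region-based, f = filter-based.
# Illustrative lookup covering only the databases named in this thread.
EXPECTED_OPERATION = {
    "refGene": "g",
    "cytoBand": "r",
    "exac03": "f",
    "gnomad211_exome": "f",
}


def check_operations(protocols, operations):
    """Return (protocol, given, expected) for every mismatched pair."""
    names = protocols.split(",")
    codes = operations.split(",")
    if len(names) != len(codes):
        # This sketch requires one operation code per protocol.
        raise ValueError("protocol and operation lists differ in length")
    mismatches = []
    for name, code in zip(names, codes):
        expected = EXPECTED_OPERATION.get(name)
        # Protocols missing from the lookup are skipped, not guessed at.
        if expected is not None and expected != code:
            mismatches.append((name, code, expected))
    return mismatches


def corrected_operations(protocols, operations):
    """Replace known-wrong codes; leave codes for unknown protocols as given."""
    names = protocols.split(",")
    codes = operations.split(",")
    return ",".join(EXPECTED_OPERATION.get(n, c) for n, c in zip(names, codes))


if __name__ == "__main__":
    protocol = "refGene,cytoBand,exac03"
    operation = "g,g,g"
    for name, given, expected in check_operations(protocol, operation):
        print(f"{name}: got '{given}', expected '{expected}'")
    print("-operation", corrected_operations(protocol, operation))
```

With that correction, the command from the original post becomes:

./table_annovar.pl example/ex1.avinput humandb/ -buildver hg38 -out myanno -remove -protocol refGene,cytoBand,exac03 -operation g,r,f -nastring . -csvout -polish -xref example/gene_xref.txt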

On Fri, Jun 23, 2023 at 5:44 PM Ksenia @.***> wrote: [quoted copy of the original post]

noranekonobokkusu commented 1 year ago

Oh I see, I misread the documentation. Thank you!