WGLab / doc-ANNOVAR

Documentation for the ANNOVAR software
http://annovar.openbioinformatics.org

How to create custom `region-based` db and use it in `table_annovar.pl` command #142

Open Shicheng-Guo opened 3 years ago

Shicheng-Guo commented 3 years ago

Hi Prof. Wang,

Could you give an example of how to create a custom region-based database and use it in a table_annovar.pl command together with ANNOVAR's pre-built databases? I have tried several different approaches and all of them failed. The basic idea is to create a custom region-based database in BED format and use it together with avsnp154, nci60, dbnsfp30, etc.

table_annovar.pl demo.v1.avinput -thread 12 ~/janssen4/bin/annovar/humandb/ -buildver hg19 -csvout -out demo -remove -protocol bed,bed -operation r,r -bedfile hg38_cytoBand.txt,hg38_demo.txt -nastring . -otherinfo -polish -xref ~/janssen4/bin/annovar/humandb/gene_fullxref.txt

It looks as though ANNOVAR handles files of the same format differently. For example, cytoBand cannot be used with protocol bed and operation region (r); it must be used with protocol cytoband and operation r.

Thanks.

Shicheng

kaichop commented 3 years ago

cytoband is a special keyword. If you want to create a region database, just create a standard BED file or generic file (five columns).
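
As a rough illustration of the five-column layout, the sketch below writes a small region file in Python. The column order (chromosome, start, end, name, annotation) mirrors the hg38_demo.txt excerpt posted later in this thread; the function names, the tab delimiter, and the `chrom:start-end` naming scheme are assumptions made for the example, not something ANNOVAR prescribes.

```python
from pathlib import Path

# Sample rows copied from the hg38_demo.txt excerpt in this thread:
# (chromosome, start, end, annotation).
SAMPLE_REGIONS = [
    ("1", 10000, 10002, "score=1570391677"),
    ("1", 10001, 10003, "score=1570391692"),
    ("1", 10002, 10004, "score=1570391694"),
]


def render_region_db(regions):
    """Return the regions as tab-separated text, one five-column row per line."""
    lines = []
    for chrom, start, end, annotation in regions:
        if not 0 <= start < end:
            raise ValueError(f"invalid interval {chrom}:{start}-{end}")
        # Hypothetical naming scheme: the fourth column is "chrom:start-end".
        name = f"{chrom}:{start}-{end}"
        lines.append("\t".join([chrom, str(start), str(end), name, annotation]))
    return "\n".join(lines) + "\n"


def write_region_db(path, regions):
    """Write the rendered regions to `path` (hypothetical helper)."""
    Path(path).write_text(render_region_db(regions))
```

Calling `write_region_db("hg38_demo.txt", SAMPLE_REGIONS)` would produce a file with the same shape as the one used in this thread; it would then need to be placed where the `-bedfile` argument can find it.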

Shicheng-Guo commented 3 years ago

For a custom BED-format database, which protocol should be used?

Here is the command I used; however, the output does not give me the expected result.

table_annovar.pl demo.v1.avinput -thread 12 ~/janssen4/bin/annovar/humandb/ -buildver hg19 -csvout -out demo -remove -protocol cytoBand,bed -operation r,r -bedfile hg38_demo.txt -nastring . -otherinfo -polish -xref ~/janssen4/bin/annovar/humandb/gene_fullxref.txt

Here is the custom BED database:

1       10000   10002   1:10000-10002   score=1570391677
1       10001   10003   1:10001-10003   score=1570391692
1       10002   10004   1:10002-10004   score=1570391694
1       10007   10009   1:10007-10009   score=1570391698
1       10008   10010   1:10008-10010   score=1570391702
1       10014   10016   1:10014-10016   score=1570391706
1       10018   10020   1:10018-10020   score=775809821
1       10019   10021   1:10019-10021   score=1570391708
1       10020   10022   1:10020-10022   score=1570391710
1       10025   10027   1:10025-10027   score=1570391712

Here is the output:

Chr,Start,End,Ref,Alt,cytoBand,bed,Otherinfo1
1,894644,894654,T,A,.,.,""
1,10001,10001,T,A,"1p36.33","Name=NA",""
1,10002,10002,A,C,"1p36.33","Name=NA",""
1,10003,10003,A,C,"1p36.33","Name=NA",""
1,10008,10008,A,G,"1p36.33","Name=NA",""
1,10009,10009,A,G,"1p36.33","Name=NA",""
1,10015,10015,A,G,"1p36.33","Name=NA",""
1,10019,10019,TA,T,.,.,""
1,10020,10020,A,C,"1p36.33","Name=NA",""
1,10021,10021,A,G,"1p36.33","Name=NA",""
1,10026,10026,A,C,"1p36.33","Name=NA",""
1,10027,10027,A,C,"1p36.33","Name=NA",""
1,10027,10027,A,G,"1p36.33","Name=NA",""
1,10032,10032,A,C,"1p36.33","Name=NA",""
1,10033,10033,A,G,"1p36.33","Name=NA",""
1,10039,10039,A,C,"1p36.33","Name=NA",""
1,10043,10043,T,A,"1p36.33","Name=NA",""
1,10045,10045,A,C,"1p36.33","Name=NA",""
1,10045,10045,A,G,"1p36.33","Name=NA",""
1,10051,10051,A,C,"1p36.33","Name=NA",""
1,10051,10051,A,G,"1p36.33","Name=NA",""
1,10051,10051,A,AC,"1p36.33","Name=NA",""
1,10055,10055,T,TA,"1p36.33","Name=NA",""

I am attaching the data I used and would really appreciate help figuring out how to do this correctly. Thanks.

Shicheng

Attachments: hg38_demo.txt, demo.v1.avinput.txt, hg38_demo.txt.idx.txt, hg38_demo.txt.txt

Shicheng-Guo commented 3 years ago

Thank God! This is the first command that succeeded:

table_annovar.pl demo.v1.avinput -thread 12 ~/janssen4/bin/annovar/humandb/ -buildver hg38 -csvout -out demo -remove -protocol bed -operation r -bedfile hg38_demo.txt -arg '-colsWanted all' -nastring . -otherinfo -polish -xref ~/janssen4/bin/annovar/humandb/gene_fullxref.txt

Chr,Start,End,Ref,Alt,bed,Otherinfo1
1,894644,894654,T,A,.,""
1,10001,10001,T,A,"Name=score=1570391677",""
1,10002,10002,A,C,"Name=score=1570391692,score=1570391677",""
1,10003,10003,A,C,"Name=score=1570391692,score=1570391694",""
1,10008,10008,A,G,"Name=score=1570391698",""
1,10009,10009,A,G,"Name=score=1570391702,score=1570391698",""
1,10015,10015,A,G,"Name=score=1570391706",""
1,10019,10019,TA,T,.,""
1,10020,10020,A,C,"Name=score=775809821,score=1570391708",""
1,10021,10021,A,G,"Name=score=1570391708,score=1570391710",""
1,10026,10026,A,C,"Name=score=1570391712",""
1,10027,10027,A,C,"Name=score=1570391712,score=1570391716",""
1,10027,10027,A,G,"Name=score=1570391712,score=1570391716",""
1,10032,10032,A,C,"Name=score=1570391720",""
1,10033,10033,A,G,"Name=score=1570391722,score=1570391720",""
1,10039,10039,A,C,"Name=score=978760828",""
1,10043,10043,T,A,"Name=score=1008829651",""
1,10045,10045,A,C,"Name=score=1570391729",""
1,10045,10045,A,G,"Name=score=1570391729",""
1,10051,10051,A,C,"Name=score=1052373574,score=1326880612",""
1,10051,10051,A,G,"Name=score=1052373574,score=1326880612",""
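
Compared with the earlier attempts, the working command uses `-buildver hg38` and passes `-arg '-colsWanted all'`, after which the annotation column shows the score values instead of `Name=NA`. The matches in the output above are consistent with reading the BED start as 0-based and the end as inclusive in 1-based terms, so a variant at position p falls in a region when start < p <= end. The Python sketch below only illustrates that rule against the first few rows of the custom database; it is not ANNOVAR's implementation, and the function name is hypothetical.

```python
# First rows of the custom BED database from this thread: (start, end, annotation).
REGIONS = [
    (10000, 10002, "score=1570391677"),
    (10001, 10003, "score=1570391692"),
    (10002, 10004, "score=1570391694"),
    (10007, 10009, "score=1570391698"),
    (10008, 10010, "score=1570391702"),
]


def overlapping_annotations(position, regions):
    """Return annotations of regions covering a 1-based position.

    Assumes BED-style half-open intervals: 0-based start, exclusive end,
    which is equivalent to start < position <= end for 1-based positions.
    """
    return [ann for start, end, ann in regions if start < position <= end]


if __name__ == "__main__":
    for pos in (10001, 10002, 10003, 10008, 10009):
        print(pos, overlapping_annotations(pos, REGIONS))
```

For positions 10001, 10002, 10003, 10008 and 10009 this yields the same sets of scores as the corresponding rows of the output above, although the order within a row may differ.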