brentp / vcfanno

annotate a VCF with other VCFs/BEDs/tabixed files
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0973-5
MIT License

breaking down CADD scores file into individual chromosomes #131

Open samkh918 opened 3 years ago

samkh918 commented 3 years ago

Hello, I have broken down the human CADD scores file into individual chromosomes and included all the resulting 24 files in the config file to speed up the annotation process, but it doesn't seem to be the correct way as the log file shows many lines that seem to indicate each variant still goes through all the files:

bix.go:251: chromosome chr1 not found in ...CADD_chr2.vcf.gz
bix.go:251: chromosome chr1 not found in ...CADD_chr3.vcf.gz
bix.go:251: chromosome chr1 not found in ...CADD_chr4.vcf.gz
...

Is it possible to limit the annotation of chr1 variants to CADD_chr1.vcf.gz or would I have to also break down my VCF file into individual chromosomes before attempting the vcfanno annotation with different config files? Thanks for your help.

brentp commented 3 years ago

Hi, vcfanno has to query every file for every variant. Those logs are just warnings but it's probably faster to have all chroms in one file.
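For illustration, a single-file setup might look like the block below. This is only a sketch: the file name CADD_all.vcf.gz, the INFO field PHRED, and the output name cadd_phred are hypothetical placeholders, since the thread does not say what the CADD VCF actually contains.

```toml
# One [[annotation]] block pointing at a single bgzipped, tabix-indexed file
# that holds all chromosomes, instead of 24 per-chromosome blocks.
[[annotation]]
file="CADD_all.vcf.gz"   # hypothetical combined file
fields=["PHRED"]         # hypothetical INFO field in that file
names=["cadd_phred"]     # hypothetical name written to the annotated VCF
ops=["self"]
```

With one block, each variant triggers one query rather than 24, and the "chromosome not found" warnings should no longer appear for this source.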

samkh918 commented 3 years ago

Thanks for your help. I also have one other question, and sorry if it's been covered elsewhere. Is it possible to use vcfanno to rename fields that are already annotated in a VCF file, or can names only be assigned when adding new annotations? For example, my VCF file is already annotated with annovar and has an "ExAC_ALL" field. I am now using vcfanno with other annotation files, but I also want to rename that annovar "ExAC_ALL" field to "annovar_ExAC_ALL". I gave it a try with [[postannotation]], but it wasn't successful:

[[postannotation]]
field=["ExAC_ALL"]
name=["annovar_ExAC_ALL"]
op=["self"]
type="String"

I get the following message after run:

vcfanno version 0.3.2 [built with go1.12.1]

see: https://github.com/brentp/vcfanno
=============================================
panic: toml: cannot load TOML value of type []interface {} into a Go string

goroutine 1 [running]:
main.main()
    /home/brentp/go/src/github.com/brentp/vcfanno/vcfanno.go:85 +0x192e

brentp commented 3 years ago

hi, see this section on postannotation. you need e.g.:

[[postannotation]]
fields=["ExAC_ALL"]
name="annovar_ExAC_ALL"
op="self"
type="String"

you can also use `delete` in postannotation, so you can delete `ExAC_ALL` after the rename (postannotation blocks are executed in the order given in the toml file).
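As a rough sketch of that rename-then-delete sequence: the field names come from this thread, while the exact form of the delete block is an assumption based on the postannotation docs, so check it against the README before relying on it.

```toml
# 1. Copy the existing annovar field to the new name.
[[postannotation]]
fields=["ExAC_ALL"]
name="annovar_ExAC_ALL"
op="self"
type="String"

# 2. Remove the original field. This block comes second because
#    postannotation blocks run in the order they appear in the file.
[[postannotation]]
fields=["ExAC_ALL"]
op="delete"
```

Note that `fields` is a list, while `name`, `op`, and `type` are plain strings. Passing lists for those keys is what triggered the "cannot load TOML value of type []interface {} into a Go string" panic above.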

samkh918 commented 3 years ago

Great thanks a lot!