brentp / vcfanno

annotate a VCF with other VCFs/BEDs/tabixed files
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0973-5
MIT License

Multiple Files, Same Fields #117

violetbrina closed 4 years ago

violetbrina commented 4 years ago

It would be good to have a way to add multiple files to the "file" option in the config. For dbNSFP the vcfs are split up by chromosome but have the exact same column mapping.

If you could please add the option to make file an array like so:

[[annotation]]
file="clinvar.norm.vcf.gz"
fields=["FATHMM_score", "MetaSVM_score", "PROVEAN_score"]
names=["FATHMM_score", "MetaSVM_score", "PROVEAN_score"]
ops=["self", "self", "self"]

Thanks

brentp commented 4 years ago

For your dbNSFP, you can just use, e.g.:

[[annotation]]
file="dbnsfp.chr1.txt.gz"
fields=["FATHMM_score", "MetaSVM_score", "PROVEAN_score"]
names=["FATHMM_score", "MetaSVM_score", "PROVEAN_score"]
ops=["self", "self", "self"]

[[annotation]]
file="dbnsfp.chr2.txt.gz"
fields=["FATHMM_score", "MetaSVM_score", "PROVEAN_score"]
names=["FATHMM_score", "MetaSVM_score", "PROVEAN_score"]
ops=["self", "self", "self"]

....

and so on for each chrom. Does that address your problem?

violetbrina commented 4 years ago

Yeah, that would work, and that's what I'm currently doing. I guess this is more of a convenience feature request than anything else: it would save me from copying the mapping and editing it for each chromosome every time I make a change.
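One way to avoid hand-editing the repeated blocks is to generate the config from a single template. The following is only a minimal sketch of that idea, not a vcfanno feature: the chromosome list and the `dbnsfp.chr{chrom}.txt.gz` naming pattern are assumptions taken from the example blocks in this thread, and the helper names are hypothetical.

```python
# Sketch: emit one [[annotation]] block per chromosome for a vcfanno config.
# Assumptions: files follow the naming used in the example above, and every
# per-chromosome file shares the same column mapping.

FIELDS = ["FATHMM_score", "MetaSVM_score", "PROVEAN_score"]
CHROMS = [str(i) for i in range(1, 23)] + ["X", "Y"]  # assumed chromosome set
FILE_PATTERN = "dbnsfp.chr{chrom}.txt.gz"  # assumed file naming


def toml_list(values):
    """Render a list of strings as a TOML array of quoted strings."""
    return "[" + ", ".join('"{}"'.format(v) for v in values) + "]"


def annotation_block(chrom, fields=FIELDS, pattern=FILE_PATTERN):
    """Build the [[annotation]] block for a single chromosome file."""
    lines = [
        "[[annotation]]",
        'file="{}"'.format(pattern.format(chrom=chrom)),
        "fields=" + toml_list(fields),
        "names=" + toml_list(fields),  # output names mirror the source fields
        "ops=" + toml_list(["self"] * len(fields)),
    ]
    return "\n".join(lines)


def build_config(chroms=CHROMS):
    """Join the per-chromosome blocks, separated by blank lines."""
    return "\n\n".join(annotation_block(c) for c in chroms) + "\n"


if __name__ == "__main__":
    print(build_config(), end="")
```

Redirecting the script's output to a file (for example `python make_conf.py > dbnsfp.toml`, both names hypothetical) gives a config that can be passed to vcfanno as usual. A change to the field list then only needs to be made once, in `FIELDS`.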

brentp commented 4 years ago

This is working as intended.