brentp / vcfanno

annotate a VCF with other VCFs/BEDs/tabixed files
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0973-5
MIT License

how to add "." for variants outside intervals #98

Closed. jungminchoilab closed this issue 5 years ago

jungminchoilab commented 5 years ago

Happy new year!

I have test.bed.gz with:

chr1 500 1000 95.0

and a conf with:

[[annotation]]
file="test.bed.gz"
columns = [4]
ops=["self"]
names=["score"]

and a query with:

chr1    5000    .   G   A   10000   PASS    AC=2;AF=1;AN=2

vcfanno adds nothing here, because the variant falls outside the interval. Would it be possible to add "score=." instead, like below?

chr1    5000    .   G   A   10000   PASS    AC=2;AF=1;AN=2;score=.

Thank you so much for sharing this tool. Jungmin

brentp commented 5 years ago

You could probably do this by annotating the query with itself and setting score=. for every variant; then only the variants that match in your test.bed.gz would be overridden. vcfanno annotates in the order given in the conf file, so you can annotate with your VCF first and then with test.bed.gz.

jungminchoilab commented 5 years ago

Thank you @brentp for the prompt response. This is a great idea. I was wondering if there is a way to annotate the query with itself without generating a different conf file for each query. I am trying to incorporate this into a pipeline that can be applied to other query files without changing the config file every time I run it.

brentp commented 5 years ago

You'd have to programmatically set the query file in the conf.

I guess you could also do it with a [[postannotation]] block that uses Lua to return '.' if the value is undefined and the value itself otherwise.
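
For the first option (programmatically setting the query file), a minimal sketch in Python might look like the following. It fills a conf template with the path of the current query and writes the result to disk, so a pipeline can regenerate the conf on every run. The names `CONF_TEMPLATE` and `write_conf` are hypothetical, and the first [[annotation]] block is an assumption: the thread does not say which field of the query to copy, so the ID column (which is "." in the example query) is used as a placeholder. Whether vcfanno then writes "score=." for that value is not confirmed here. The query would presumably also need to be bgzipped and tabix-indexed to serve as an annotation file.

```python
from pathlib import Path

# Hypothetical template: {query} is replaced with the path of the query VCF.
# Block 1 is an ASSUMPTION (self-annotation from the ID column, not given in
# the thread). Block 2 is the conf from the original question. Order matters
# because vcfanno annotates in the order given in the conf file.
CONF_TEMPLATE = """\
[[annotation]]
file="{query}"
fields=["ID"]
ops=["self"]
names=["score"]

[[annotation]]
file="test.bed.gz"
columns = [4]
ops=["self"]
names=["score"]
"""


def write_conf(query_vcf: str, conf_path: str) -> str:
    """Fill the template with the query path, write it, and return the text."""
    text = CONF_TEMPLATE.format(query=query_vcf)
    Path(conf_path).write_text(text)
    return text


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python make_conf.py query.vcf.gz run.conf
    if len(sys.argv) == 3:
        write_conf(sys.argv[1], sys.argv[2])
```

The pipeline would then call vcfanno with the generated conf and the same query file, so the template itself stays unchanged between runs.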

jungminchoilab commented 5 years ago

thank you! very helpful.

Best, Jungmin