brentp / vcfanno

annotate a VCF with other VCFs/BEDs/tabixed files
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0973-5
MIT License

Change header ##INFO description to "Number=A"? #69

Closed: garrettjstevens closed this issue 7 years ago

garrettjstevens commented 7 years ago

When I annotate a VCF using another VCF as a source, for lines where there are multiple ALT alleles I get comma-separated values for each INFO ID, e.g. using CADD:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  SAMPLE1
1   16388580    rs3978467   C   A,T 350.2   .   cadd_phred=6,6.3;cadd_raw=0.3,0.4;  GT:AD:DP:GQ:PL  0/1:13,13,0:26:99:281,0,494,354,536,1067

This is exactly what I want. In the header tag that vcfanno adds, however, it has this:

##INFO=<ID=cadd_phred,Number=1,Type=Float,Description="phred-scaled cadd score (from ...)">
##INFO=<ID=cadd_raw,Number=1,Type=Float,Description="raw cadd score (from ...)">

It seems like this should be "Number=A", according to the VCF specs:

If the field has one value per alternate allele then this value should be ‘A’.

I'm guessing it might just be copying this value from the source VCF (the source has "Number=1", probably because it expects all variants with multiple ALT alleles to be split into single lines). Is there a way to get vcfanno to change this value to "Number=A", since that's what it is in the newly annotated VCF?

Thanks
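
As a stopgap while the header is still emitted as Number=1, the annotated VCF's header lines could be rewritten after the fact. The following is a minimal sketch, not part of vcfanno: the function name `set_info_number` is hypothetical, and it assumes each header line starts with `##INFO=<ID=...,Number=...` in that order, as in the two lines quoted above.

```python
import re

# Hypothetical helper, not part of vcfanno: rewrite the Number value of
# selected ##INFO header lines. Assumes ID comes first and Number second,
# as in the header lines quoted in this issue.
INFO_RE = re.compile(r'##INFO=<ID=([^,>]+),Number=([^,>]+)')


def set_info_number(header_line, ids, number="A"):
    """Return header_line with Number=<number> if it declares one of ids."""
    m = INFO_RE.match(header_line)
    if m is None or m.group(1) not in ids:
        # Not an INFO line we were asked to change: leave it untouched.
        return header_line
    # Splice in the new value, keeping the rest of the line byte-for-byte.
    return header_line[:m.start(2)] + number + header_line[m.end(2):]


line = ('##INFO=<ID=cadd_phred,Number=1,Type=Float,'
        'Description="phred-scaled cadd score (from ...)">')
print(set_info_number(line, {"cadd_phred", "cadd_raw"}))
```

Applied to every line of the annotated file that starts with `##`, this changes only the declared Number for the listed IDs and leaves the records themselves alone.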

brentp commented 7 years ago

This is related to #68. I'll have a look at these next week.

brentp commented 7 years ago

Can you give https://github.com/brentp/vcfanno/releases/tag/v0.2.7-beta a try and let me know if it resolves your issue?

garrettjstevens commented 7 years ago

I've been watching the discussion on #68, and it looks like I'm having the same problem (the header is now Number=A, but the annotation for the second ALT is now missing). I also got some warnings on STDERR, all like:

api.go:216: WARNING: got single value %!s(float64=15.300000190734863) for value with Number=A

See the attached tar.gz for a full, minimal working example: test.tar.gz

I agree this is a tricky problem, but I am just fine with only the first case you describe in #68 being supported.

Thanks
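
To find which records are affected, one could compare the number of values in each annotated INFO field with the number of ALT alleles. This is a rough sketch under stated assumptions: the function name `number_a_mismatches` is hypothetical, the record is a tab-delimited VCF line, and the second example record below is invented for illustration (it is not taken from the attached test data).

```python
# Hypothetical checker, not part of vcfanno: for fields declared Number=A,
# report those whose value count differs from the ALT allele count.
def number_a_mismatches(record_line, ids):
    """Map each offending INFO ID to (number of values, number of ALTs)."""
    fields = record_line.rstrip("\n").split("\t")
    n_alt = len(fields[4].split(","))      # column 5 is ALT
    mismatches = {}
    for entry in fields[7].split(";"):     # column 8 is INFO
        if "=" not in entry:
            continue                       # flag-type or empty entry
        key, value = entry.split("=", 1)
        if key in ids:
            n_values = len(value.split(","))
            if n_values != n_alt:
                mismatches[key] = (n_values, n_alt)
    return mismatches


ids = {"cadd_phred", "cadd_raw"}

# Record from the original report: two ALTs, two values per field.
ok = "1\t16388580\trs3978467\tC\tA,T\t350.2\t.\tcadd_phred=6,6.3;cadd_raw=0.3,0.4;"
print(number_a_mismatches(ok, ids))

# Invented record: two ALTs, but only one value was annotated.
bad = "1\t100\t.\tC\tA,T\t50\t.\tcadd_phred=15.3"
print(number_a_mismatches(bad, ids))
```

An empty result means every checked field has one value per ALT allele; a non-empty result lists the fields where an allele's annotation is missing or extra.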

brentp commented 7 years ago

that warning is from the last version of vcfanno, not the beta, right?

brentp commented 7 years ago

let's continue this in #68 to keep it centralized.