Closed anthakki closed 3 months ago
Duplicate tags are not allowed. I am not sure if it is explicitly stated in the VCF specification, but that was the intention.
The parsing is done in htslib; ideally it should give a warning and drop the duplicate fields. Obviously, the easiest solution is to avoid producing invalid VCFs :)
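One way to avoid producing such files is to check the FORMAT column for repeated keys before handing a VCF to bcftools. The following is a minimal sketch in plain Python, not part of bcftools or htslib; the function name `duplicate_format_keys` is hypothetical, and it assumes tab-separated VCF data lines with FORMAT as the ninth column.

```python
from collections import Counter


def duplicate_format_keys(vcf_line):
    """Return the FORMAT keys that occur more than once in one VCF line.

    Hypothetical helper for illustration; assumes tab-separated columns
    with FORMAT in the ninth column (index 8).
    """
    if vcf_line.startswith("#"):
        return []  # header lines carry no FORMAT column
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return []  # sites-only record, nothing to check
    counts = Counter(fields[8].split(":"))
    return [key for key, n in counts.items() if n > 1]


if __name__ == "__main__":
    record = "1\t100\t.\tA\tC\t.\t.\t.\tGT:DP:GT\t0/1:7:0/1"
    print(duplicate_format_keys(record))
```

A pre-processing step could run this over every data line and refuse to continue (or drop the repeated field) when the returned list is non-empty.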
Running `bcftools` (here `filter`, but it seems to affect other commands as well) on a VCF with duplicate GT (genotype) FORMAT fields seems to change all but the last GT value. It looks like `./.`, `0/0`, `0/1`, and `1/1` get converted to `0,0`, `2,2`, `2,4`, and `4,4`, respectively. I'm not 100% sure if duplicate GT values are legal, but I would expect an error instead of invalid data. Non-GT fields don't seem to have the problem. I'm using bcftools 1.19, but this can also be reproduced in bcftools 1.12.

Minimized test case follows. I would expect the payload to match that of the input.
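As an aside on the numbers reported above: they match the integer encoding that BCF uses for genotype alleles, where a missing allele is stored as 0 and allele index `i` is stored as `(i + 1) << 1`, with the lowest bit marking phasing. That would suggest the earlier GT fields are being printed as raw encoded integers rather than decoded genotypes, though this is an interpretation and not something confirmed in the thread. The sketch below reproduces the mapping in plain Python; the function name `encode_unphased_gt` is hypothetical and only unphased (`/`-separated) genotypes are handled.

```python
def encode_unphased_gt(gt):
    """Encode an unphased genotype string such as "0/1" the way BCF
    stores GT alleles: missing -> 0, allele i -> (i + 1) << 1.

    Hypothetical helper for illustration; phased genotypes ("|") are
    deliberately not handled here.
    """
    encoded = []
    for allele in gt.split("/"):
        if allele == ".":
            encoded.append(0)  # missing allele
        else:
            encoded.append((int(allele) + 1) << 1)
    return encoded


if __name__ == "__main__":
    for gt in ("./.", "0/0", "0/1", "1/1"):
        print(gt, "->", ",".join(str(v) for v in encode_unphased_gt(gt)))
```

The four genotypes from the report map to `0,0`, `2,2`, `2,4`, and `4,4` under this encoding, which is the same set of values observed in the output.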