tedil closed this 2 months ago
Great observation. This occurs because all BCF values are tagged with physical types, which is how the decoder previously determined field value types. The info field and samples value decoders now use the header type definitions to resolve the logical type. Note that record buffers from different formats can be decoded into the same types only if the type definitions in the header are available, which is not guaranteed in VCF. Thanks for reporting!
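To illustrate the difference between physical and logical types, here is a minimal sketch of header-driven resolution. The `Number` and `Value` enums and the `resolve` function are hypothetical, simplified stand-ins written for this example; they are not the noodles API, and the actual decoder may work differently.

```rust
// Hypothetical, simplified types for illustration only; not the noodles API.

/// The value count a field declares in the header (`Number=` in VCF).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Number {
    /// A fixed count, e.g. `Number=1`.
    Count(usize),
    /// `Number=A`: one value per alternate allele.
    AlternateBases,
}

/// A decoded field value.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Integer(i32),
    Array(Vec<i32>),
}

/// Resolves a physically decoded value to its logical type using the
/// `Number` taken from the header definition.
fn resolve(physical: Value, number: Number) -> Value {
    match (physical, number) {
        // A scalar stays a scalar only when the header declares `Number=1`.
        (Value::Integer(n), Number::Count(1)) => Value::Integer(n),
        // Any other definition is logically an array, even with one element.
        (Value::Integer(n), _) => Value::Array(vec![n]),
        // Values that are already arrays pass through unchanged.
        (array, _) => array,
    }
}
```

Under these assumptions, `resolve(Value::Integer(7), Number::AlternateBases)` yields `Value::Array(vec![7])`, which compares equal to what a reader that already produces arrays would return, while the same physical value with `Number::Count(1)` stays a scalar.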
noodles 0.75.0/noodles-bcf 0.55.0 now uses header type definitions when decoding info and samples series values. This is stricter than the previous approach, so let me know if anything surprising happens. Thanks!
noodles 0.74.0
Given the following example VCF file:
and its corresponding BCF conversion (`bcftools view example.vcf -Ob > example.bcf`), the following snippet fails:

The reason is that the `RecordBuf` returned by the BCF reader uses single values (when `Number=A` happens to be `Number=1`), while the `RecordBuf` returned by the VCF reader (correctly) uses arrays (with just a single value).