Closed apraga closed 5 months ago
Thanks for reporting, @apraga. This case is undefined in VCF 4.1/4.2. I asked for clarification in https://github.com/samtools/hts-specs/issues/760.
This behavior remains undefined, but for inputs that are VCF < 4.3, noodles will now read other header record values that start with a map prefix (<
) as a string if there is no map identifier (i.e., ID=
), e.g.,
##fileformat=VCFv4.2
##A=<a>
##B=<ID=b>
#CHROM POS ID REF ALT QUAL FILTER INFO
Header {
file_format: FileFormat { major: 4, minor: 2 },
other_records: {
Other("A"): Unstructured(["<a>"]),
Other("B"): Structured({
"b": Map { inner: Other { id_tag: Standard(Id) }, other_fields: {} },
}),
},
// ...
}
This is explicitly invalid in VCF >= 4.3 and will continue to fail, e.g.,
##fileformat=VCFv4.3
##A=<a>
##B=<ID=b>
#CHROM POS ID REF ALT QUAL FILTER INFO
Error: Custom {
kind: InvalidData,
error: InvalidRecord(InvalidValue(InvalidOtherMap(
Other("A"),
ParseError {
id: None,
kind: InvalidField(InvalidKey(UnexpectedEof)),
},
))),
}
These rules also apply to the serializer when writing a VCF header.
Thanks @zaeleus for this fix, that solves my issue !
Hi,
Thanks for you package. Some data (Clinvar in my case) has a
ID
in header##ID=<Description="ClinVar Variation ID">
Minimal working example
Parsing fails with
called
Result::unwrap()on an
Errvalue: Custom { kind: InvalidData, error: InvalidRecord(InvalidValue(InvalidOtherMap(Other("ID"), ParseError { id: None, kind: MissingId }))) }
Removing the offending header line solve the issue. I'm not sure if that should be managed by noodles but figured it would help to report it nevertheless.
Thanks,