samtools / htsjdk

A Java API for high-throughput sequencing data (HTS) formats.
http://samtools.github.io/htsjdk/

Enforce reserved fields in the header #1553

Open nh13 opened 3 years ago

nh13 commented 3 years ago

For example, varlociraptor currently writes the FORMAT/DP field as one value per alternate allele, whereas the spec defines it as a single value (the total read depth for the sample). The resulting error is cryptic: it doesn't tell you which key it was trying to parse. It would also be great to check the header to see whether any reserved keys are defined with the wrong number or type. The current stack trace:

Caused by: java.lang.NumberFormatException: For input string: "35,35"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.parseInt(Integer.java:615)
    at htsjdk.variant.vcf.AbstractVCFCodec.createGenotypeMap(AbstractVCFCodec.java:820)
    at htsjdk.variant.vcf.AbstractVCFCodec$LazyVCFGenotypesParser.parse(AbstractVCFCodec.java:121)
    at htsjdk.variant.variantcontext.LazyGenotypesContext.decode(LazyGenotypesContext.java:158)
    at htsjdk.variant.variantcontext.LazyGenotypesContext.getGenotypes(LazyGenotypesContext.java:148)
    at htsjdk.variant.variantcontext.GenotypesContext.get(GenotypesContext.java:417)
    at htsjdk.variant.variantcontext.VariantContext.getGenotype(VariantContext.java:1102)
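
To illustrate the kind of message that would help here, below is a minimal sketch of wrapping the integer parse so the failure names the key, the sample, and the raw value. It is not htsjdk code: the class `GenotypeFieldParser`, the method `parseIntField`, and the message wording are hypothetical names chosen for this example.

```java
public class GenotypeFieldParser {

    /**
     * Parses a FORMAT value that is expected to be a single integer.
     * Hypothetical helper, not part of htsjdk: it only shows how the
     * key and sample could be attached to the parse failure.
     */
    static int parseIntField(String key, String sample, String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // Keep the original exception as the cause so the stack trace is not lost.
            throw new IllegalArgumentException(
                "Could not parse FORMAT/" + key + " value \"" + raw
                    + "\" for sample " + sample + " as a single Integer", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseIntField("DP", "sample1", "35"));
        try {
            parseIntField("DP", "sample1", "35,35");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a message like this, the "35,35" input from the trace above would point directly at FORMAT/DP instead of only at `Integer.parseInt`.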

lbergelson commented 3 years ago

These are all good points and suggestions. That error is aggressively unhelpful.
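
As a rough sketch of the header check suggested above, the following compares declared FORMAT definitions against the Number and Type that the VCF specification reserves for a few keys. It is self-contained and does not use htsjdk classes: `ReservedFormatKeyCheck`, `RESERVED_FORMAT`, and `check` are hypothetical names, the table is only a subset of the reserved keys, and the `Number=A` declaration in `main` is an assumed stand-in for the per-alternate-allele DP described in this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ReservedFormatKeyCheck {

    // Subset of reserved FORMAT keys from the VCF specification: key -> {Number, Type}.
    static final Map<String, String[]> RESERVED_FORMAT = Map.of(
        "GT", new String[] {"1", "String"},
        "DP", new String[] {"1", "Integer"},
        "GQ", new String[] {"1", "Integer"},
        "AD", new String[] {"R", "Integer"},
        "PL", new String[] {"G", "Integer"});

    /**
     * Returns one message per reserved key whose declared Number or Type
     * differs from the specification. Keys that are not reserved are ignored.
     */
    static List<String> check(Map<String, String[]> declared) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String[]> entry : declared.entrySet()) {
            String[] expected = RESERVED_FORMAT.get(entry.getKey());
            if (expected == null) {
                continue; // not a reserved key in this subset
            }
            String[] actual = entry.getValue();
            if (!expected[0].equals(actual[0]) || !expected[1].equals(actual[1])) {
                problems.add("FORMAT/" + entry.getKey()
                    + " is declared as Number=" + actual[0] + ",Type=" + actual[1]
                    + " but the spec reserves it as Number=" + expected[0]
                    + ",Type=" + expected[1]);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumed header content: DP declared per alternate allele.
        Map<String, String[]> declared = Map.of("DP", new String[] {"A", "Integer"});
        check(declared).forEach(System.out::println);
    }
}
```

Whether such a mismatch should be an error or a warning is a separate design choice; the sketch only collects the messages.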