samtools / hts-specs

Specifications of SAM/BAM and related high-throughput sequencing file formats
http://samtools.github.io/hts-specs/

Double precision fields in bcf #378

Open explodecomputer opened 5 years ago

explodecomputer commented 5 years ago

I hope this is the right place to ask about this. I read with interest this thread from 2013 about the possibility of introducing double-precision fields in the BCF format.

A particular use case for this is recording p-values from genome-wide association studies, where the difference between 1e-40 and smaller values is important. Storing these values as log10 is possible, but it leads to problems when integrating with other software, which invariably expects p-values.
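
To make the precision problem concrete, here is a minimal Python sketch. It assumes the BCF Float type is an IEEE 754 single-precision (32-bit) value; the helpers `to_float32` and `relative_error` are hypothetical names used only for illustration and are not part of htslib or bcftools. It round-trips a few p-values through 32 bits and compares that with the -log10 workaround.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Round-trip a Python float (a double) through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def relative_error(x: float) -> float:
    """Relative error introduced by storing x as a 32-bit float."""
    return abs(to_float32(x) - x) / abs(x)

for p in (1e-30, 1e-40, 1e-50, 1e-300):
    # Direct storage: precision degrades below about 1.18e-38 (subnormal range)
    # and the value becomes exactly 0 below about 1.4e-45.
    direct = to_float32(p)
    # Workaround mentioned above: store -log10(p) in the 32-bit field instead.
    recovered = 10.0 ** -to_float32(-math.log10(p))
    print(f"p={p:g}  as float32={direct:g}  via -log10={recovered:g}")
```

In this sketch the -log10 encoding survives the 32-bit field, but every consumer then has to know that the field holds a transformed value rather than a p-value, which is the integration problem described above.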

Is there any plan for this to be introduced?

jkbonfield commented 5 years ago

I'd recommend bringing this up on https://github.com/samtools/hts-specs/ instead, as BCF is a file format rather than simply the bcftools implementation.

I can't comment on the request though as I'm not directly involved with either.

jmarshall commented 5 years ago

Issue transferred to hts-specs.

That mailing list thread covers the bases. I'm not quite as sanguine as Heng was about “C program[s] can tell whether double or float should be used for a real number” — if this were to be added and VCF didn't distinguish between float/double, the spec would want to have fairly specific rules about how implementations should choose.
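
As an illustration of why the choice would need specific rules, the sketch below compares two plausible ways an implementation might decide. Both rules are hypothetical and invented here for illustration; neither comes from the VCF/BCF specification or from the mailing list thread, and the function names are assumptions.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Round-trip a double through IEEE 754 single precision (inf on overflow)."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        # struct refuses finite values too large for 32 bits; treat as overflow.
        return math.copysign(math.inf, x)

def choose_by_range(x: float) -> str:
    """Hypothetical rule A: use float unless the value underflows to 0 or overflows."""
    y = to_float32(x)
    lost = (y == 0.0 and x != 0.0) or (math.isinf(y) and not math.isinf(x))
    return "double" if lost else "float"

def choose_by_precision(x: float, tol: float = 2.0 ** -24) -> str:
    """Hypothetical rule B: use float only if the relative rounding error is <= tol."""
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return "float"
    return "float" if abs(to_float32(x) - x) <= tol * abs(x) else "double"

for value in (0.5, 1e-40, 1e-50):
    print(value, choose_by_range(value), choose_by_precision(value))
```

The two rules agree on 0.5 and on 1e-50 but disagree on 1e-40, which is a nonzero float32 subnormal with reduced precision. Two writers following different heuristics would therefore produce different BCF encodings of the same VCF text.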