samtools / htsjdk

A Java API for high-throughput sequencing data (HTS) formats.
http://samtools.github.io/htsjdk/

Give VCFEncoder an option to print float values with full precision #1598

Open roland-ewald opened 2 years ago

roland-ewald commented 2 years ago

VCFEncoder.formatVCFDouble restricts the precision of floating-point values to three digits after the decimal point:

https://github.com/samtools/htsjdk/blob/8f82871c167cca7f3f9921a251469f96dd1a2979/src/main/java/htsjdk/variant/vcf/VCFEncoder.java#L261-L279

This may not be enough for some applications (consider, for example, very small allele frequencies). Could we add an option to VCFEncoder that formats such values as %.9g? That would keep full 32-bit float precision, which would nicely match the number format assumed by the VCF/BCF standard for other fields like QUAL.
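
To illustrate the difference, here is a minimal sketch that does not call htsjdk at all. The class `PrecisionDemo`, its helper methods, and the sample value are hypothetical, and `%.3e` is only a stand-in for a three-digit format; the actual branching in `formatVCFDouble` is in the code linked above. The sketch formats a small 32-bit float both ways and checks whether parsing the text gives back the same float.

```java
import java.util.Locale;

public class PrecisionDemo {
    // Stand-in for a three-digit format (an assumption, not the htsjdk code itself).
    static String threeDigits(final float value) {
        return String.format(Locale.ROOT, "%.3e", (double) value);
    }

    // Proposed format: nine significant digits.
    static String nineSignificant(final float value) {
        return String.format(Locale.ROOT, "%.9g", (double) value);
    }

    // True if parsing the text yields exactly the original float.
    static boolean roundTrips(final String text, final float original) {
        return Float.parseFloat(text) == original;
    }

    public static void main(final String[] args) {
        final float af = 1.2345678e-5f; // hypothetical small allele frequency

        final String compact = threeDigits(af);
        final String full = nineSignificant(af);

        System.out.println(compact + " round-trips: " + roundTrips(compact, af));
        System.out.println(full + " round-trips: " + roundTrips(full, af));
    }
}
```

Nine significant decimal digits are enough to identify any 32-bit float uniquely, so the `%.9g` text parses back to the original value, while the three-digit text does not for this input. `Locale.ROOT` keeps the decimal separator a period regardless of the JVM's default locale.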

I'm happy to prepare a PR, but I realize this touches some very mature parts of the codebase, so I'd like to get some feedback first.

Expected behaviour

VCFEncoder should be configurable to output annotation values with full floating-point precision.
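
One possible shape for such an option, purely as a starting point for discussion: the enum `FloatPrecision` and the class `EncoderOptionSketch` below are hypothetical and not part of htsjdk, and `COMPACT` uses a plain `%.3f` as a simplified stand-in for the current behaviour rather than reproducing it.

```java
import java.util.Locale;

/** Hypothetical option type; not part of htsjdk. */
enum FloatPrecision {
    /** Simplified stand-in for the current output: three digits after the decimal point. */
    COMPACT("%.3f"),
    /** Nine significant digits, as proposed in this issue. */
    FULL("%.9g");

    private final String pattern;

    FloatPrecision(final String pattern) {
        this.pattern = pattern;
    }

    /** Formats the value with a locale-independent decimal separator. */
    String format(final double value) {
        return String.format(Locale.ROOT, pattern, value);
    }
}

public class EncoderOptionSketch {
    public static void main(final String[] args) {
        final double af = 0.000012345678; // hypothetical small allele frequency
        System.out.println("compact: " + FloatPrecision.COMPACT.format(af));
        System.out.println("full:    " + FloatPrecision.FULL.format(af));
    }
}
```

Keeping the compact mode as the default would leave existing output unchanged, so only callers that opt in would see longer values.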

Actual behaviour

VCFEncoder can't be configured in this way right now.