Open pontikos opened 5 years ago
This has come up before, although I'm struggling to find the issue. Maybe it was over in htsjdk land.
Anyway, this is a parsing bug in GATK, not in bcftools output. Floating point numbers are a superset of integers. "70" is still a valid floating point number and C "atof" and "strtod" functions quite happily accept whole numbers.
While I guess we could change all floating point numbers to include .0 if they are whole numbers, it needlessly wastes space and isn't the correct solution.
Ok I've posted on GATK github:
https://github.com/broadinstitute/gatk/issues/5789
I agree that it seems silly that GATK falls over when a decimal point is missing for a float.
I hope htsjdk (assuming that's what GATK are using) and htslib can agree on this.
Yes, this is a silly bug in GATK and we will not address this in bcftools / htslib. As a workaround, you can "fix" the numbers to GATK's liking using this script https://github.com/samtools/bcftools/blob/develop/misc/fix-broken-GATK-Double-vs-Integer
Thanks! I also wrote a script to fix it. GATK don't want to fix it as GATK 3 is no longer maintained. If you can point me to the line of code in bcftools that does this, I can change it in my own version.
It's probably kputd in kstring.c. This uses %g to print floats if they are very large or very small, and otherwise emulates the printf %g format itself. The z[-1] = 0 line MAY be responsible, along with some editing of the trailing zero removal, but you'll need to experiment. Note though that this is just following the normal printing mechanism. E.g. try printf on the command line:
jkb$ printf "%g\n" 0.170
0.17
jkb$ printf "%g\n" 1.70
1.7
jkb$ printf "%g\n" 17.0
17
"17", not "17.0"!
INFO fields of type float should have a decimal point even if the number has trailing zeroes, i.e. 70.0 instead of 70. Rounding to an integer breaks GATK.