Closed by lbeltrame 11 years ago
Luca; What LC_ALL setting triggers the issue? I'll try to reproduce here and work on a general fix. Apologies in advance, I'm unfortunately a bit ignorant of all of the locale issues so will have to feel my way around.
On Tuesday 3 September 2013 12:09:54, Brad Chapman wrote:
> What LC_ALL setting triggers the issue? I'll try to reproduce here and work
I run it_IT on my local session. In this case, decimal points are converted (correctly) to commas in some results (some percentages in the genotype fields of the VCF file are written as 38,4% rather than 38.4%, for example).
To fix this I have to export LC_ALL=C or unset LANG for the whole session, which is obviously not desirable.
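The symptom described here (38,4% written instead of 38.4%) can be checked for mechanically. The following is a minimal sketch, not part of the pipeline: the function name and the regular expression are assumptions made for illustration. Since plain commas are legal separators in VCF fields, it only flags the "digits,digits%" form, and it is a heuristic that could still misfire on a comma-separated list ending in an integer percentage.

```python
import re

# Percentages written with a decimal comma, e.g. "38,4%".
# Plain commas are valid VCF separators, so only "digits,digits%" is flagged.
DECIMAL_COMMA_PCT = re.compile(r"\d+,\d+%")


def lines_with_decimal_comma(vcf_lines):
    """Return (line_number, line) pairs for data lines that look locale-damaged.

    Hypothetical helper for illustration; header lines (starting with '#')
    are skipped.
    """
    hits = []
    for number, line in enumerate(vcf_lines, start=1):
        if line.startswith("#"):
            continue
        if DECIMAL_COMMA_PCT.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Passing an open file handle works as well as a list of strings, since the function only iterates over lines.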
Luca; I pushed fixes to the VarScan java run to use English/US locales so that the VCF output will use US-style decimal output. Can you let me know if this fixes the issue? I found a number of previous bugs related to this in VarScan commits, so I think it might be coming from there rather than GATK. Let me know if this doesn't fix it and we can revisit.
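The thread does not show the actual change, so the following is only a sketch of one common way to pin a Java process to an English/US locale: passing the standard JVM system properties user.language and user.country on the command line. The function name and the jar path are hypothetical and are not taken from the pipeline's code.

```python
def varscan_java_cmd(jar_path, subcommand, extra_args=()):
    """Build a java command line that forces an English/US default locale.

    Hypothetical helper: user.language and user.country are standard JVM
    system properties, but how the pipeline sets them is an assumption.
    """
    return [
        "java",
        "-Duser.language=en",  # format numbers with English conventions
        "-Duser.country=US",   # ...specifically the US variant
        "-jar", jar_path,
        subcommand,
        *extra_args,
    ]


# Example with a placeholder jar path; the list can be handed to subprocess.
cmd = varscan_java_cmd("VarScan.jar", "mpileup2snp", ["--output-vcf", "1"])
```

Setting the locale on the JVM itself keeps the fix local to the one tool and leaves the user's session locale untouched.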
On Friday 6 September 2013 07:30:36, Brad Chapman wrote:
Hello Brad,
> VarScan commits so think it might be coming from there rather than GATK. Let me know if this doesn't fix it and we can revisit.
I'll have a go next Monday and report back. Feel free to ping me if I don't give any response.
Sorry for not being able to test yet, I'm presenting a poster in a couple of weeks + preparing to teach a course this semester, so I'm insanely busy. I should be able to look at it next week.
No worries, just let us know when you get time. Good luck with all the preparation work.
Confirmed fixed. Closing report.
GATK's CombineVariants, at least in the configuration I used for testing VarScan, is locale-aware, which means it will use either the decimal point or the comma depending on the running locale.
This in turn can produce broken VCF files, which contain commas instead of decimal points.
As the pipeline should not require a specific locale setting, LC_ALL should be overridden to C while the various processes are running.
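The override proposed above can be applied per process rather than for the whole session. Below is a minimal sketch, assuming the pipeline launches its tools through Python's subprocess module; the helper name run_with_c_locale is hypothetical. It copies the current environment, sets LC_ALL to C only for the child process, and leaves the caller's locale unchanged.

```python
import os
import subprocess
import sys


def run_with_c_locale(cmd):
    """Run cmd with LC_ALL=C in the child's environment only.

    Hypothetical helper for illustration; the parent's environment is
    copied, so the user's own session locale is not modified.
    """
    env = os.environ.copy()
    env["LC_ALL"] = "C"
    return subprocess.run(cmd, env=env, check=True,
                          capture_output=True, text=True)


# Demonstration: the child process sees LC_ALL=C regardless of the session.
result = run_with_c_locale(
    [sys.executable, "-c", "import os; print(os.environ['LC_ALL'])"]
)
```

This avoids having to export LC_ALL=C or unset LANG for the whole session, which is the workaround the report calls undesirable.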