samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

validator mistakenly rejects decimals in FORMAT/QR #2257

Closed oliverdrechsel closed 3 months ago

oliverdrechsel commented 3 months ago

bcftools view fails on one of my VCF files generated by Freebayes in --gvcf mode.

The error points to lines like the example shown after the message:

[E::vcf_parse_format_fill5] Invalid character '.' in 'QR' FORMAT field at NC_003277.2:1238

NC_003277.2     1238    .       A       <*>     0       .       DP=178;END=1407;MIN_DP=0        GQ:DP:MIN_DP:QR:RO:QA:AO        1.00144e+06:178:0:1.00307e+06:178:1628:0

I assume the validator tries to reject empty fields in QR that are given as '.', but here the field contains a decimal, '1.00307e+06'. Removing that line from the VCF file makes this error message disappear, but plenty of others are still reported.

Is there a good way to make the validator accept decimals?

Thanks a lot in advance.

pd3 commented 3 months ago

How is the tag defined in the header, is it Type=Float or something else?
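
For anyone following along, one way to look up the definition is to filter the header for the tag. This is a minimal sketch: format_def is a hypothetical helper name and input.vcf is a placeholder file name, neither comes from the thread.

```shell
# Hypothetical helper: print the header definition of one FORMAT tag.
# Reads header text on stdin; the tag ID is the first argument.
format_def() {
    grep "^##FORMAT=<ID=$1,"
}

# Usage (input.vcf is a placeholder file name):
#   bcftools view -h input.vcf | format_def QR
```

The Type= value in the printed line is the declared type of the field.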

oliverdrechsel commented 3 months ago

Thanks for the immediate response!

##FORMAT=<ID=QR,Number=1,Type=Integer,Description="Sum of quality of the reference observations">

I see that bcftools is even smarter than I knew. Is it correct that Freebayes (or I) need to put Float instead of Integer in the header line?

pd3 commented 3 months ago

Yes, it attempts to parse an integer and fails. You can easily check this by correcting the header with bcftools reheader:

   # Write out the header to be modified
   bcftools view -h old.bcf > header.txt

   # Edit the header using your favorite text editor
   vi header.txt

   # Reheader the file
   bcftools reheader -h header.txt -o new.bcf old.bcf
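
If the manual edit has to be repeated for many files, the same three steps can probably be scripted. The sketch below is only an illustration: fix_qr_type and header.fixed.txt are hypothetical names, and it assumes the QR header line reads exactly Type=Integer as quoted earlier in this thread. The file names old.bcf and new.bcf are the placeholders from the commands above.

```shell
# Hypothetical helper: print the header with the QR FORMAT tag switched
# from Integer to Float. Only the line starting with "##FORMAT=<ID=QR,"
# is changed; every other header line is passed through untouched.
fix_qr_type() {
    sed '/^##FORMAT=<ID=QR,/s/Type=Integer/Type=Float/' "$1"
}

# Same steps as above, without the interactive editor.
# Skipped when bcftools or the placeholder input file is missing.
if command -v bcftools >/dev/null 2>&1 && [ -f old.bcf ]; then
    # Write out the header to be modified
    bcftools view -h old.bcf > header.txt
    # Apply the type change
    fix_qr_type header.txt > header.fixed.txt
    # Reheader the file
    bcftools reheader -h header.fixed.txt -o new.bcf old.bcf
fi
```

Other tags with the same mismatch would need their own substitution; this sketch only touches QR.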