Thank you, Simon, for the report.
We checked again, and the latest version of Clair3 uses ID=AF,Number=A
for its VCF header, as shown in https://github.com/HKU-BAL/Clair3/blob/9b601b23464699d59a93b0f0bce40444b0dd0cf3/shared/utils.py#L285.
Would you mind letting us know which version of Clair3 you were using?
I was using 1.0.8. I've updated to 1.0.10 and the issue has been fixed.
Sorry for the inconvenience! And thanks again for a great tool 🙏
Hi,
I've encountered a small bug in the formatting of the output VCF files.
In the header of the VCF it says:
##FORMAT=<ID=AF,Number=1,Type=Float,Description="Observed allele frequency in reads, for each ALT allele, in the same order as listed, or the REF allele for a RefCall">
where Number=1 is specified. As I understand it, this is not correct for multi-allelic sites such as the following:
chr1 180847 . C CCCCT,CCT 10.68 PASS F GT:GQ:DP:AD:AF 1/2:10:102:14,11,24:0.1078,0.2353
where AF has the value 0.1078,0.2353, which is a list of multiple floats. I think the correct specification would be Number=G, so that it corresponds to the number of genotypes.
Thanks for a great tool!
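For reference, here is a minimal sketch of how the VCF Number codes translate into value counts for the record quoted above. It is not part of Clair3; the function names are hypothetical, and it assumes a whitespace-separated, single-sample, diploid record. Under the VCF specification, Number=A means one value per ALT allele, Number=R one value per allele including REF, and Number=G one value per possible genotype.

```python
from math import comb


def expected_count(number: str, n_alt: int, ploidy: int = 2) -> int:
    """Return how many values a FORMAT field should carry for a given Number code."""
    if number == "A":  # one value per ALT allele
        return n_alt
    if number == "R":  # one value per allele, REF included
        return n_alt + 1
    if number == "G":  # one value per unordered genotype over all alleles
        return comb(n_alt + ploidy, ploidy)
    return int(number)  # fixed count such as "1"


def matching_number_codes(record_line: str, key: str) -> list[str]:
    """List the Number codes whose expected count matches the observed values."""
    fields = record_line.split()
    n_alt = len(fields[4].split(","))      # ALT column
    format_keys = fields[8].split(":")     # FORMAT column
    sample_values = fields[9].split(":")   # first (only) sample column
    observed = len(sample_values[format_keys.index(key)].split(","))
    return [
        code for code in ("1", "A", "R", "G")
        if expected_count(code, n_alt) == observed
    ]


record = (
    "chr1 180847 . C CCCCT,CCT 10.68 PASS F "
    "GT:GQ:DP:AD:AF 1/2:10:102:14,11,24:0.1078,0.2353"
)
print(matching_number_codes(record, "AF"))  # ['A']
print(matching_number_codes(record, "AD"))  # ['R']
```

With two ALT alleles at a diploid site, Number=G would imply six values, so the two AF values in this record are consistent with Number=A, which is what the newer Clair3 header declares.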