Closed philipwfowler closed 1 year ago
I think this is due to the VCF specifying that it is a `1/1` call, despite having a `MIN_FRS` filter fail. When the minor populations flag is used, it also implicitly uses the `ignore_filter` flag to ensure that such rows make it to a point where the minor populations are found. However, because this row has a non-wild-type call once the filter fail is ignored, it ends up being reported as an actual call.
I'll do some digging; hopefully it shouldn't be too difficult to fix.
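
To make the suspected mechanism concrete, here is a minimal sketch in plain Python. It is not the tool's actual code: `VcfRow`, `classify` and the `guard_failed_calls` switch are hypothetical names invented for illustration, and the logic is a simplification that only assumes what is described above (the minor populations flag implies ignoring the filter, so a filter-failed `1/1` row is trusted as a call). The alt-read fraction of 0.1 for the `0/0` row is a made-up value.

```python
from dataclasses import dataclass


@dataclass
class VcfRow:
    """Hypothetical, simplified view of one VCF row (not the tool's data model)."""
    genotype: str        # e.g. "0/0" or "1/1"
    filter_pass: bool    # False when a filter such as MIN_FRS failed
    alt_fraction: float  # fraction of reads supporting the alt allele


def classify(row, minor_populations=False, guard_failed_calls=False):
    """Return "dominant", "minor" or None for a single row.

    Assumption: minor_populations implicitly switches on ignore_filter.
    guard_failed_calls is a hypothetical fix that refuses to trust the
    genotype of any row that failed a filter.
    """
    ignore_filter = minor_populations
    if not row.filter_pass and not ignore_filter:
        return None  # filter-failed rows are dropped entirely
    trust_call = row.filter_pass or not guard_failed_calls
    if row.genotype == "1/1" and trust_call:
        return "dominant"
    if minor_populations and not row.filter_pass and row.alt_fraction > 0:
        return "minor"
    return None


failed_alt = VcfRow("1/1", filter_pass=False, alt_fraction=0.68)
failed_ref = VcfRow("0/0", filter_pass=False, alt_fraction=0.1)

print(classify(failed_alt))                          # None
print(classify(failed_ref, minor_populations=True))  # minor
print(classify(failed_alt, minor_populations=True))  # dominant (the bug)
print(classify(failed_alt, minor_populations=True,
               guard_failed_calls=True))             # minor
```

If the real code path looks anything like this, the fix may be as small as checking the filter status before trusting the genotype, but that needs confirming against the actual implementation.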
Through a comparison of ~1600 samples from South Africa, focussing on the gene pncA, I've found that a row in the VCF with a `MIN_FRS` filter fail but which was called as `1/1` is recorded as a dominant variant/mutation (e.g. Q141P) rather than a minor variant (e.g. Q141P:0.68).

Using the VCF below, if I issue
I can see there are two rows for pncA, both with `MIN_FRS` filter fails. One is `0/0` and the other `1/1`. Then if I do

I see one of these is reported as a minor allele, but the other (the `1/1`) is reported as a dominant mutation. If I try without the `--minor_populations` flag then neither row is reported, as expected.

Hence it looks like there is a bug when (i) the minor populations are being investigated and (ii) for variants which are called `1/1`.

site.10.subj.AG01753383.lab.AG01753383.iso.1.v0.12.4.per_sample.vcf.gz minor_alleles.txt
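
For anyone wanting to find other affected rows, the sketch below scans a single-sample VCF for rows that fail a named filter within a coordinate range and reports their genotype. It is not the command used above: `suspect_rows` is a hypothetical helper, the coordinate range must be supplied by the caller, and it relies only on the standard VCF column layout (FILTER in column 7, FORMAT in column 9, the sample in column 10).

```python
import gzip


def suspect_rows(vcf_path, start, end, filter_name="MIN_FRS"):
    """Yield (pos, genotype) for rows in [start, end] that fail filter_name.

    Hypothetical helper: assumes a single-sample, tab-separated VCF.
    """
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 10:
                continue  # no FORMAT/sample columns to read
            pos = int(fields[1])
            if not start <= pos <= end:
                continue
            if filter_name not in fields[6].split(";"):
                continue
            # Pair FORMAT keys with the sample's values to look up GT
            sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
            yield pos, sample.get("GT")
```

Rows yielded with a genotype of `1/1` are the ones that would be expected to hit the behaviour described in this issue.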