Closed by iqbal-lab 2 years ago
If the percentile threshold goes up with depth, wouldn't that mean this variant ends up even further away from the confidence cut-off?
No, the number going up is the area under the curve. We're saying: accept any genotype confidence in the top x% of the area under the curve. Currently we accept any genotype confidence in the top 90%, and I'm saying that for high coverage we could accept anything in the top 99% or something.
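To make the "top x% of the area under the curve" point concrete, here is a minimal sketch. It assumes the cutoff is simply a percentile of some set of genotype confidences; how mykrobe actually builds that distribution is not shown in this thread, and `cutoff_for_top_percent` and the toy data are hypothetical names and values used only for illustration.

```python
def cutoff_for_top_percent(confidences, keep_percent):
    """Return the lowest confidence that still falls in the top keep_percent.

    Hypothetical helper for illustration; not mykrobe's implementation.
    """
    if not confidences:
        raise ValueError("need at least one confidence value")
    ordered = sorted(confidences)
    # Keeping the top x% is the same as discarding the bottom (100 - x)%.
    n_discarded = int(len(ordered) * (100 - keep_percent) // 100)
    index = min(n_discarded, len(ordered) - 1)
    return ordered[index]


# Toy distribution (the integers 1..100), not real mykrobe output.
toy_confidences = list(range(1, 101))
cutoff_90 = cutoff_for_top_percent(toy_confidences, 90)
cutoff_99 = cutoff_for_top_percent(toy_confidences, 99)
```

In this toy case, keeping the top 99% gives a lower cutoff than keeping the top 90%, so raising the percentage makes the filter more permissive and moves a borderline call closer to passing, not further away.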
This is likely related to #121
Closing in favour of #121
Looking at a very high coverage nanopore sample, we have found a good example where there is a very high confidence call, in agreement with phenotype, but which is below the dynamically calculated threshold.
INFO:mykrobe.cmds.amr:Confidence cutoff (using percent cutoff 90%): 12033
and the variant had confidence 11957.
The relevant bit of the JSON is
So what is going on? At this high depth, the whole genotype confidence distribution is shifted far to the right, and the default threshold, which is set to keep 90% of samples, is too strict. At this depth, almost everything is going to be good.
What's the fix? Well, I guess the bloody percentile threshold we choose (i.e. the 90%) should go up with depth.
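One way the "goes up with depth" idea could look is sketched below, purely as an illustration. The function name, the linear interpolation, and the depth bounds (50 and 500) are all assumptions made for the example; only the 90% and 99% endpoints come from this thread.

```python
def keep_percent_for_depth(depth, low_depth=50, high_depth=500,
                           low_keep=90.0, high_keep=99.0):
    """Interpolate the kept percentage between low_keep and high_keep.

    Hypothetical sketch: the depth bounds are made-up defaults and the
    linear shape is an arbitrary choice, not something mykrobe does.
    """
    if high_depth <= low_depth:
        raise ValueError("high_depth must be greater than low_depth")
    if depth <= low_depth:
        return low_keep
    if depth >= high_depth:
        return high_keep
    # Fraction of the way from low_depth to high_depth.
    fraction = (depth - low_depth) / (high_depth - low_depth)
    return low_keep + (high_keep - low_keep) * fraction


low_cov = keep_percent_for_depth(10)     # at or below low_depth
mid_cov = keep_percent_for_depth(275)    # halfway between the bounds
high_cov = keep_percent_for_depth(1000)  # at or above high_depth
```

With these made-up defaults, a low-depth sample keeps the current 90% behaviour and a very high coverage sample keeps 99%. Where the bounds should sit, and whether a linear ramp is the right shape, would need to be checked against real samples.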