iqbal-lab-org / pandora

Pan-genome inference and genotyping with long noisy or short accurate reads
MIT License

Slight issue with checking minimum genotype confidence threshold #320

Closed by mbhall88 1 year ago

mbhall88 commented 1 year ago

https://github.com/rmcolq/pandora/blob/565c1572f2c535db20c45062d359055d04a7659a/src/sampleinfo.cpp#L262-L263

The wording for this option on the command line is:

-G,--gt-conf INT            Minimum genotype confidence (GT_CONF) required to make a call [default: 1]

but we only check whether the variant's gt conf is strictly greater than the threshold. Given that we describe it as the minimum confidence required to make a call, the check should be >=.

As a result, I can't allow calls with gt conf 0 (equal coverage on ref and alt): the threshold variable is a uint16_t, so the lowest value I can set is 0, and setting it to -1 will break the universe.
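To make the problem concrete, here is a minimal sketch. The names `passes_strict` and `to_threshold` are hypothetical and are not taken from pandora; the sketch only illustrates the strict comparison described above and what a plain C++ conversion of -1 to `uint16_t` yields. How pandora's command-line parser actually handles a negative value is not shown here.

```cpp
#include <cstdint>

// Hypothetical stand-in for the current check: strictly greater than.
constexpr bool passes_strict(double confidence, std::uint16_t threshold) {
    return confidence > static_cast<double>(threshold);
}

// Hypothetical helper: converting an int to uint16_t is modular,
// so -1 becomes 65535 rather than "one below zero".
constexpr std::uint16_t to_threshold(int requested) {
    return static_cast<std::uint16_t>(requested);
}
```

With the lowest possible threshold of 0, `passes_strict(0.0, 0)` is false, so a call with gt conf 0 can never pass, and `to_threshold(-1)` gives 65535, which would reject nearly everything.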

The other thing that may slightly complicate this is that the confidence is a double, so checking for equality will require some care. We'll probably have to cast the threshold to a double and then do a floating-point comparison that tolerates a small difference (e.g. 0.0001).
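Something along these lines might work as a tolerant >= check. This is only a sketch: the function name `meets_min_confidence` and the default tolerance of 1e-4 are assumptions for illustration, not the code in `sampleinfo.cpp`.

```cpp
#include <cstdint>

// Hypothetical helper: true when confidence is at least the threshold,
// treating values within `tolerance` below the threshold as equal to it.
constexpr bool meets_min_confidence(double confidence,
                                    std::uint16_t threshold,
                                    double tolerance = 1e-4) {
    // Cast the integer threshold to double before comparing.
    return confidence >= static_cast<double>(threshold) - tolerance;
}
```

With this, `meets_min_confidence(0.0, 0)` is true, so a threshold of 0 would allow calls with equal coverage on ref and alt, while `meets_min_confidence(0.5, 1)` is still false. Folding the tolerance into a single >= avoids a separate equality branch.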

leoisl commented 1 year ago

Thanks! I can fix this right after the next PR (lazy loading)! Or, if you prefer, you can fix it and then I'll merge. There might be other places in the code with this issue as well...

mbhall88 commented 1 year ago

I will sort it out today or tomorrow 👍