What causes differences in meth_qual distributions between samples?

There are a number of factors that can affect the probabilities output by the Remora model. These include the overall modified base context, including other modified bases in close proximity (within 10 bases of one another). There may also be some run-to-run variability contributing to the output probabilities. I would suggest that normalizing these probabilities is generally not advisable: the model is fundamentally outputting lower confidence at these calls, which is likely meaningful. There may be settings where normalizing the output probabilities is beneficial, but I would try to avoid it for most generic analyses.
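If the goal is just to see how two samples differ, one option is to summarize the raw probabilities side by side rather than rescaling them. The snippet below is only a rough sketch, not anything from Remora itself. It assumes the scores are 8-bit values in the style of the SAM `ML` tag, where a value N covers the probability range [N/256, (N+1)/256). The helper names, the 0.2/0.8 "ambiguous" cutoffs, and the sample values are all made up for illustration.

```python
from statistics import quantiles


def ml_to_prob(ml_value: int) -> float:
    """Map an 8-bit ML-style score (0-255) to the midpoint of its probability bin."""
    if not 0 <= ml_value <= 255:
        raise ValueError(f"expected a value in 0..255, got {ml_value}")
    return (ml_value + 0.5) / 256


def summarize(ml_values, low=0.2, high=0.8):
    """Describe one sample's probability distribution without altering it.

    `low` and `high` are arbitrary illustrative cutoffs: calls strictly
    between them are counted as ambiguous (low confidence either way).
    """
    probs = [ml_to_prob(v) for v in ml_values]
    q1, median, q3 = quantiles(probs, n=4)  # needs at least two values
    ambiguous = sum(low < p < high for p in probs)
    return {
        "n": len(probs),
        "q1": q1,
        "median": median,
        "q3": q3,
        "frac_ambiguous": ambiguous / len(probs),
    }


# Hypothetical per-call scores for two samples (not real data).
sample_a = [250, 252, 3, 1, 255, 0, 248, 5]
sample_b = [200, 180, 60, 40, 230, 90, 150, 20]

for name, values in (("sample_a", sample_a), ("sample_b", sample_b)):
    print(name, summarize(values))
```

A higher `frac_ambiguous` in one sample would point to more low-confidence calls there. Per the above, that is probably worth looking into (for example, modified base density or run differences) before deciding to normalize anything.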

We are certainly aiming to constrain these probabilities to a more consistent distribution, both through modeling and through increased consistency on the platform. I hope this helps, but please post more details if you have particular downstream analyses that require these probabilities to be normalized.

nanoporetech/remora issue #141