jts / nanopolish

Signal-level algorithms for MinION data
MIT License

Parameter distributions of ONT and Nanopolish are not matching with each other #983

Closed: kaltinel closed this issue 2 years ago

kaltinel commented 2 years ago

Hi,

I am looking into the kmer parameter distributions, and I noticed that the kmer distribution parameters (mean of the event current, standard deviation, etc.) are not the same in the Nanopolish model and the ONT canonical model (https://github.com/nanoporetech/kmer_models/blob/master/r9.4_180mv_70bps_5mer_RNA/template_median69pA.model). I am interested in R9.4.1 pore information for RNA.

Why are they different, and which one should I use as ground truth?

My example is:

From Nanopolish:

kmer | level_mean | level_stdv
TTGAC | 104.2 | 4.49

From ONT canonical:

kmer | level_mean | level_stdv
TTGAC | 122.772197 | 4.029736

If I generate data points from these parameters, they fall into very different ranges: [image]

I would appreciate your insight. Thanks!
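To make the size of the gap concrete, here is a minimal sketch that treats each kmer model as a Gaussian with the `level_mean` and `level_stdv` quoted above. The Gaussian assumption and the helper name `draw_levels` are illustrative choices, not something taken from either model file.

```python
from statistics import NormalDist

# Parameters quoted in this thread for kmer TTGAC.
# Assumption: each model describes a Gaussian over event current levels.
nanopolish_ttgac = NormalDist(mu=104.2, sigma=4.49)
ont_ttgac = NormalDist(mu=122.772197, sigma=4.029736)


def draw_levels(dist, n=1000, seed=0):
    """Draw n simulated event levels from a Gaussian kmer model (hypothetical helper)."""
    return dist.samples(n, seed=seed)


nanopolish_samples = draw_levels(nanopolish_ttgac)
ont_samples = draw_levels(ont_ttgac)

# Overlap coefficient in [0, 1]: values near 0 mean the two models
# share almost no signal range.
overlap = nanopolish_ttgac.overlap(ont_ttgac)
print(f"overlap between the two TTGAC models: {overlap:.3f}")
```

A small overlap here would indicate that the two entries are unlikely to describe the same kmer, which points at a labelling or orientation difference rather than noise.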

jts commented 2 years ago

Hi,

I set up these models a number of years ago, so my memory is a bit fuzzy, but I believe that the ONT model files list k-mers in the 3'->5' direction and nanopolish reverses them to be 5'->3'. I looked up the level for CAGTT (TTGAC reversed) in the ONT file and it seems to match the nanopolish value for TTGAC.

Jared
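The orientation difference described above can be checked with a short sketch. It assumes a whitespace-separated model file whose first three columns are `kmer`, `level_mean` and `level_stdv`. The function names `parse_model`, `reverse_kmer` and `reorient` are hypothetical and are not part of nanopolish. The k-mer string is reversed only, not complemented.

```python
def reverse_kmer(kmer):
    """Reverse a k-mer string (no complementing), e.g. 3'->5' to 5'->3'."""
    return kmer[::-1]


def parse_model(text):
    """Parse 'kmer level_mean level_stdv ...' rows into {kmer: (mean, stdv)}.

    Assumption: columns are whitespace-separated; comment lines start with '#'.
    """
    model = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or line.startswith("#"):
            continue
        try:
            mean, stdv = float(fields[1]), float(fields[2])
        except ValueError:
            continue  # skip a header row such as 'kmer level_mean level_stdv'
        model[fields[0]] = (mean, stdv)
    return model


def reorient(model):
    """Return a copy of the model with every k-mer key reversed."""
    return {reverse_kmer(kmer): params for kmer, params in model.items()}


# The single ONT row quoted earlier in this thread.
ont_text = """kmer level_mean level_stdv
TTGAC 122.772197 4.029736
"""
ont_model = parse_model(ont_text)
flipped = reorient(ont_model)
print(flipped)
```

Under this assumption, the ONT row labelled TTGAC corresponds to CAGTT in nanopolish's orientation, and the ONT row to compare against nanopolish's TTGAC is the one labelled CAGTT.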

kaltinel commented 2 years ago

Hi Jared, I see! They indeed match, thank you very much.