Closed. Shians closed this issue 3 years ago.
This is per read, so I would expect the accuracy to be lower, although this really doesn't look good. Try aggregating over multiple reads, or try one of the newer models trained on data basecalled with Guppy. Alternatively, if you have data from unamplified E. coli DNA, it may be worth training a new model that better represents your species-flowcell-basecaller combination.
I see from the README that the columns are "chromosome, read name, genomic position, position k-mer context, features, strand, label, and probability of methylation".
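Given that column order, per-read calls can be summarised per genomic position, which is one way to do the "aggregate over multiple reads" step. The following is only a minimal sketch, not part of mCaller: the function name `aggregate_per_position`, the 0.5 threshold, and the `min_reads` parameter are hypothetical, and it assumes the output is tab-separated with the probability in the last column.

```python
import csv
from collections import defaultdict

# Column order as quoted from the README; tab separation is an assumption.
COLUMNS = ["chrom", "read", "pos", "context", "features", "strand", "label", "prob"]


def aggregate_per_position(lines, threshold=0.5, min_reads=1):
    """Summarise per-read rows by (chromosome, position, strand).

    `lines` is any iterable of text lines, e.g. an open file handle.
    `threshold` and `min_reads` are illustrative defaults, not mCaller settings.
    """
    probs = defaultdict(list)
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed rows
        rec = dict(zip(COLUMNS, row))
        try:
            key = (rec["chrom"], int(rec["pos"]), rec["strand"])
            prob = float(rec["prob"])
        except ValueError:
            continue  # skip a header line or non-numeric fields
        probs[key].append(prob)

    summary = {}
    for key, values in probs.items():
        n = len(values)
        if n < min_reads:
            continue  # too few reads to say anything about this site
        summary[key] = {
            "reads": n,
            "mean_prob": sum(values) / n,
            # fraction of reads whose probability is at or above the threshold
            "frac_above": sum(v >= threshold for v in values) / n,
        }
    return summary
```

For a PCR-amplified control, most sites would be expected to show a low `mean_prob` and a low `frac_above` once enough reads cover them. Sites that stay high across many reads point at the model rather than per-read noise.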
Here is some output from PCR-amplified E. coli samples, which should have no methylation.
Is mCaller reporting a high probability of methylation in all 6mA contexts?