cultivarium / MicrobeMod

A toolkit for exploring prokaryotic methylation and base modifications in nanopore sequencing
MIT License

--percent_cutoff_streme parameter changes _methylated_sites.tsv output #23

Open winterlich opened 4 months ago

winterlich commented 4 months ago

I ran into a possible problem in the call_methylation step while comparing the methylation calls for one dataset under different parameters. With the default parameters, no 4mC modifications appear in _methylated_sites.tsv. If I add the parameter --percent_cutoff_streme 0.66, 24 4mC sites appear in _methylated_sites.tsv for the same dataset. If I understand the documentation correctly, this parameter should only change the STREME motif calling, not the contents of _methylated_sites.tsv itself. Is that right?
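The comparison described above can be reproduced with a short script that lists which sites are present in only one of the two runs and counts them per modification type. This is a minimal sketch and not part of MicrobeMod: the column names in KEY_COLUMNS are assumptions and should be changed to match the actual header of _methylated_sites.tsv, and the two tables in the demo are made-up toy data.

```python
import csv
import io
from collections import Counter

# Assumed (hypothetical) column names; edit to match the real TSV header.
KEY_COLUMNS = ("Contig", "Position", "Strand", "Modification")


def load_sites(handle, key_columns=KEY_COLUMNS):
    """Return the set of site keys found in a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {tuple(row[col] for col in key_columns) for row in reader}


def diff_sites(default_handle, streme_handle, key_columns=KEY_COLUMNS):
    """Return (only_in_default, only_in_streme) as sets of site keys."""
    default_sites = load_sites(default_handle, key_columns)
    streme_sites = load_sites(streme_handle, key_columns)
    return default_sites - streme_sites, streme_sites - default_sites


def count_by_type(sites, type_index=3):
    """Count site keys per modification type (4th key column by default)."""
    return Counter(site[type_index] for site in sites)


# Toy data standing in for the two runs (not real results).
default_run = "Contig\tPosition\tStrand\tModification\ncontig_1\t10\t+\t6mA\n"
streme_run = default_run + "contig_1\t25\t-\t4mC\n"

only_default, only_streme = diff_sites(
    io.StringIO(default_run), io.StringIO(streme_run)
)
print("only in default run:", sorted(only_default))
print("only in --percent_cutoff_streme run:", sorted(only_streme))
print("extra sites per type:", dict(count_by_type(only_streme)))
```

With real data, the two io.StringIO objects would be replaced by open file handles for the two _methylated_sites.tsv outputs.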

If you want to analyse my data, please contact me directly.

alexcritschristoph commented 4 months ago

Is this happening for just 4mC sites, or all methylation types?

winterlich commented 4 months ago

Just for 4mC.

The parameter does not change the number of 5mC or 6mA calls.

alexcritschristoph commented 3 months ago

I can't figure out why this is occurring, but I think it is some kind of edge case, perhaps when a 4mC call falls at the same position as another type of call such as 5mC (due to base pair error).

Fortunately, just 24 4mC sites in _methylated_sites.tsv shouldn't change the interpretation of the results, I think.

Give modkit pileup a try and have a look at what the 4mC sites above look like in the resulting BED file. I suspect that would make it clear what is going on.
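One way to inspect those sites is to parse the bedMethyl output and collect every modification call reported at each position of interest, so that overlapping 4mC and 5mC calls become visible. The sketch below is illustrative only and rests on several assumptions: the column layout follows the bedMethyl format as documented for modkit (column 4 is the modification code, column 10 the valid coverage, column 11 the percent modified), 4mC is reported under the numeric code 21839 while 5mC uses m, positions are 0-based starts, and the example rows are made up.

```python
from typing import Dict, Iterable, Iterator, List, NamedTuple, Set, Tuple


class PileupRow(NamedTuple):
    contig: str
    position: int          # 0-based start coordinate from the BED file
    mod_code: str          # e.g. "m", "a", or "21839" (assumed 4mC code)
    strand: str
    valid_coverage: int
    percent_modified: float


def parse_bedmethyl(lines: Iterable[str]) -> Iterator[PileupRow]:
    """Yield one PileupRow per data line, skipping headers and blanks."""
    for line in lines:
        # Split on any whitespace, since some modkit versions mix
        # tabs and spaces in the trailing columns.
        fields = line.split()
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        yield PileupRow(
            fields[0], int(fields[1]), fields[3], fields[5],
            int(fields[9]), float(fields[10]),
        )


SiteKey = Tuple[str, int, str]


def calls_at_sites(
    rows: Iterable[PileupRow], sites: Set[SiteKey]
) -> Dict[SiteKey, List[PileupRow]]:
    """Group rows by (contig, position, strand) for the requested sites."""
    grouped: Dict[SiteKey, List[PileupRow]] = {}
    for row in rows:
        key = (row.contig, row.position, row.strand)
        if key in sites:
            grouped.setdefault(key, []).append(row)
    return grouped


# Made-up rows: two modification codes reported at the same cytosine.
example = [
    "contig_1\t24\t25\t21839\t12\t-\t24\t25\t255,0,0\t12 75.00 9 3 0 0 0 0 0",
    "contig_1\t24\t25\tm\t12\t-\t24\t25\t255,0,0\t12 8.33 1 3 0 0 0 0 0",
]
for site, rows in calls_at_sites(parse_bedmethyl(example), {("contig_1", 24, "-")}).items():
    for row in rows:
        print(site, row.mod_code, row.valid_coverage, row.percent_modified)
```

If the positions in _methylated_sites.tsv are 1-based, subtract 1 before building the set of sites to look up.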

Curious if anyone encounters a similar issue with a more substantial discrepancy.