nanoporetech / remora

Methylation/modified base calling separated from basecalling.
https://nanoporetech.com

Support for Standalone 5mC and/or 5mCG in dorado Basecalling Model Version 4.3.0 #172

Closed: wietingj closed this issue 4 months ago

wietingj commented 4 months ago

Hello everyone,

As of dorado basecalling model version 4.3.0, only the combined 5mC_5hmC and 5mCG_5hmCG modified-base models are supported, not 5mC or 5mCG alone. This causes problems in downstream analysis, especially in the statistical evaluation of differential methylation, because 5mC and 5hmC are then combined into one count. For modkit this is addressed on its limitations page (https://nanoporetech.github.io/modkit/limitations.html), but the same problem also exists in other tools such as NanoMethViz and DSS.

As far as I know, there is currently no option for separate statistical evaluation of the respective modifiers from a combined 5mC_5hmC modbam file, so it would be desirable if standalone 5mC or 5mCG could also be supported in the current model versions. Or is there an option I am missing?

Thanks for your feedback.

marcus1487 commented 4 months ago

I think modkit adjust-mods --ignore h is the command you are looking for. Please let me know if this does not resolve the issue.
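For anyone scripting this step, here is a minimal sketch of calling the command from Python. It assumes modkit is installed and on the PATH, and that adjust-mods takes the input and output modBAM paths as positional arguments before --ignore. The function names and BAM file names are hypothetical placeholders, not anything from remora or modkit.

```python
import shutil
import subprocess


def build_adjust_mods_cmd(in_bam, out_bam, ignore_code="h"):
    """Build the argv list for `modkit adjust-mods IN OUT --ignore CODE`.

    Assumption: input and output modBAM paths are positional arguments.
    """
    return ["modkit", "adjust-mods", str(in_bam), str(out_bam),
            "--ignore", ignore_code]


def strip_5hmc(in_bam, out_bam):
    """Write a new modBAM with the 5hmC (code 'h') calls removed."""
    if shutil.which("modkit") is None:
        raise RuntimeError("modkit executable not found on PATH")
    # check=True raises CalledProcessError if modkit exits non-zero.
    subprocess.run(build_adjust_mods_cmd(in_bam, out_bam), check=True)


if __name__ == "__main__":
    # Hypothetical file names; this only prints the command, it does not run it.
    cmd = build_adjust_mods_cmd("calls_5mC_5hmC.bam", "calls_5mC_only.bam")
    print(" ".join(cmd))
```

The resulting BAM should then carry only 5mC calls, so it can be passed to downstream tools that expect a single modification type.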

For the further question of "separate statistical evaluation of the respective modifiers from a combined 5mC_5hmC modbam file", I'm not quite sure I understand what you mean. Could you expand on this a bit further?

marcus1487 commented 4 months ago

Hopefully this has resolved your issue. If you have further questions please reopen this issue.