GoekeLab / m6anet

Detection of m6A from direct RNA-Seq data
https://m6anet.readthedocs.io/
MIT License

Inference identifying Inosine as m6A? #33

Closed ruicatxiao closed 2 years ago

ruicatxiao commented 2 years ago

Hey dev team,

I am wondering whether the built-in inference model will identify inosine signals as m6A (N6-methyladenosine) signals. In your professional opinion, could m6Anet be trained to pick up inosine signals? I don't think inosine would pass through the pore and generate exactly guanine-like signals, since 5mC and 6mA already alter the signal relative to unmodified C and A.

chrishendra93 commented 2 years ago

Hi @ruicatxiao, sorry for the delayed reply.

I think it would be interesting to look at these other modifications. Technically, if the modification does not cause segmentation errors during the nanopolish eventalign step and its signal profile differs from a normal guanine-like signal, it should be detectable to some extent by an m6Anet model trained on good enough labels. One caveat: if the modification does not alter the current much, the averaging done by nanopolish eventalign might make it harder for us to detect, and in that case we may need to preprocess the raw squiggle differently or use the raw squiggle directly.

ruicatxiao commented 2 years ago

Thank you for taking the time to answer! I will extract the sequence around each predicted A modification and run some motif enrichment analyses. I also have deep RNA-seq from the same sample to detect potential A-to-I editing, and I am planning to use TOMBO to check the raw signal profile around the enriched motifs. It would be really nice if m6Anet inference could identify inosine as well as m6A! My organism of interest completely lacks any type of m6A MTase, but it does have a functional ADAR, which performs A-to-I editing.
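
For anyone following along, here is a minimal sketch of the "extract sequence around predicted sites and look for enriched motifs" step. It is not part of m6Anet: the function names `centered_kmers` and `motif_enrichment` are hypothetical, and it assumes you already have a transcript sequence as a string plus 0-based positions of the predicted modified A's. The ratio it reports is a plain frequency ratio against all A-centered k-mers in the same sequence, not a statistical test.

```python
from collections import Counter


def centered_kmers(seq, positions, flank=2):
    """Return the (2*flank+1)-mers centered on each 0-based position.

    Positions too close to either end of the sequence are skipped.
    """
    kmers = []
    for pos in positions:
        start, end = pos - flank, pos + flank + 1
        if start >= 0 and end <= len(seq):
            kmers.append(seq[start:end])
    return kmers


def motif_enrichment(seq, predicted_positions, flank=2):
    """Ratio of k-mer frequency at predicted sites vs. all A-centered k-mers."""
    # Background: every A in the sequence, whether predicted or not.
    all_a_positions = [i for i, base in enumerate(seq) if base == "A"]
    foreground = Counter(centered_kmers(seq, predicted_positions, flank))
    background = Counter(centered_kmers(seq, all_a_positions, flank))
    fg_total = sum(foreground.values())
    bg_total = sum(background.values())
    return {
        kmer: (count / fg_total) / (background[kmer] / bg_total)
        for kmer, count in foreground.items()
        if background[kmer] > 0
    }


# Toy example: both predicted sites sit in a GGACT context.
toy_seq = "GGACTTTAGGACTCCATT"
print(motif_enrichment(toy_seq, [2, 10]))
```

With real data you would run this per transcript (or pool the counts across transcripts) and then compare the enriched k-mers against the sites where the RNA-seq data supports A-to-I editing.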

chrishendra93 commented 2 years ago

Hi @ruicatxiao, I think TOMBO will be a good baseline for this problem. Currently you need to edit the m6Anet preprocessing script in order to extract anything beyond the DRACH motifs (the preprocessing step will only extract DRACH motifs). You can extract other motifs by changing the M6A_KMERS variable in m6anet/scripts/constants.py (it is basically a Python list that contains all allowed motifs). Afterwards you can probably re-train m6Anet to recognize this, though I think the current code will require some adjustment in order to do that. Let me know if it works for you.

Regards

Christopher Hendra
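
As a rough illustration of the M6A_KMERS suggestion above, the sketch below builds a list of allowed 5-mers from an IUPAC pattern. Only the variable name M6A_KMERS and its role as a Python list of allowed motifs come from the thread; `IUPAC`, `expand_motif`, and the `NNANN` pattern are assumptions made for this example, and it is not confirmed here that the rest of the m6Anet pipeline accepts non-DRACH k-mers without further changes.

```python
from itertools import product

# Standard IUPAC nucleotide codes needed for the patterns below.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "D": "AGT",   # not C
    "H": "ACT",   # not G
    "N": "ACGT",  # any base
}


def expand_motif(pattern):
    """Expand an IUPAC pattern into every concrete k-mer it matches."""
    return ["".join(bases) for bases in product(*(IUPAC[code] for code in pattern))]


# The DRACH motif expands to 18 concrete 5-mers.
DRACH_KMERS = expand_motif("DRACH")

# Hypothetical replacement list: every 5-mer with A in the center,
# e.g. as a starting point when looking for A-to-I editing sites.
M6A_KMERS = expand_motif("NNANN")

print(len(DRACH_KMERS), len(M6A_KMERS))
```

Widening the list this much increases the number of candidate sites considerably, so a narrower pattern derived from the motif enrichment results may be a more practical first step.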