nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

very basic question, not an issue #730

Closed mocherry closed 3 months ago

mocherry commented 3 months ago

Dear all,

I am kind of new to analyzing epigenetics data so you have to excuse my naive question: When I use Dorado for basecalling m6A-RNA modifications from direct RNA-sequencing and the resulting BAM-file has the MM tag which can be extracted with modkit, why do I need nanopolish and e.g. m6anet or any of the other tools to basically get the same information? Is the "mod_qual | probability of the base modification" column from "modkit extract" not reliable enough? What tools would I have to use to determine and visualize results from such an analysis?

Thanks and best, Matthias

marcus1487 commented 3 months ago

We have tried to supply modified base calls in standard community formats and with as little extra user effort as possible. We hope this means that users do not need third-party tools for any modified base for which we have released a model (either in Dorado, or in Rerio for research models). Third-party tools may offer additional functionality for research exploration, and we encourage users to explore getting the most out of their nanopore data!

The mod_qual column from modkit extract should certainly be reliable. We have described the accuracy of the various released models in previous "Data for Breakfast" or "Data After Dark" sessions at London Calling and the Nanopore Community Meetings. You can see the current accuracy metrics in the latest NCM presentation. Is there a particular reason you suspect that the mod_qual values are not reliable enough?

For downstream tools, we provide modkit to process the modbam files and perform common downstream analysis tasks. We aim to make this tool the bedtools/samtools for modified bases. For visualization I would mostly recommend genome browsers, such as IGV and JBrowse, which support direct visualization of modbam files as well as downstream BED-based formats. If there are particular common visualizations that would be useful we can look into supporting these, but we have generally found that visualizations are quite customized to specific applications.
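As an illustration of the kind of downstream step discussed above, here is a minimal Python sketch that aggregates per-read calls from a modkit extract table into per-site counts. It is not an official workflow. It assumes a tab-separated table with a header containing the columns `chrom`, `ref_position`, `mod_code` and `mod_qual`, that m6A is reported with the code `a`, and that unaligned positions carry a negative `ref_position`; check these against the output of your modkit version. The function name, the 0.9 threshold and the example rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarize_mod_calls(tsv_handle, mod_code="a", threshold=0.9):
    """Count modified vs. total calls per reference site for one mod code.

    Returns {(chrom, ref_position): (n_modified, n_total)}.
    The column names used here are assumptions about the table layout.
    """
    counts = defaultdict(lambda: [0, 0])
    for row in csv.DictReader(tsv_handle, delimiter="\t"):
        if row["mod_code"] != mod_code:
            continue  # skip calls for other modification types
        ref_position = int(row["ref_position"])
        if ref_position < 0:
            continue  # skip calls that are not placed on the reference
        site = (row["chrom"], ref_position)
        counts[site][1] += 1
        # Simplistic rule: a call counts as modified at or above the threshold.
        if float(row["mod_qual"]) >= threshold:
            counts[site][0] += 1
    return {site: tuple(pair) for site, pair in counts.items()}


# Made-up rows for illustration only, not real data.
EXAMPLE_TSV = (
    "read_id\tchrom\tref_position\tmod_code\tmod_qual\n"
    "r1\tchr1\t100\ta\t0.97\n"
    "r2\tchr1\t100\ta\t0.40\n"
    "r3\tchr1\t100\ta\t0.95\n"
    "r1\tchr1\t250\ta\t0.10\n"
    "r2\tchr1\t250\tm\t0.99\n"
    "r4\t.\t-1\ta\t0.99\n"
)

summary = summarize_mod_calls(io.StringIO(EXAMPLE_TSV))
for (chrom, pos), (n_mod, n_total) in sorted(summary.items()):
    print(f"{chrom}\t{pos}\t{n_mod}/{n_total}")
```

A single hard threshold treats every call below it as unmodified, which is a simplification. For real analyses, modkit's own pileup command performs this kind of per-site aggregation and is likely the better starting point.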

mocherry commented 3 months ago

Dear Marcus,

Thanks for your comments. What I am missing in the whole Nanopore community is step-by-step protocols showing how to go about an analysis using real-world examples. I have come to appreciate this in the R community, where transcriptomics or scRNA-seq analyses are demonstrated and explained with a multitude of working examples from start to finish. The modkit documentation, for example, is far too technical and complicated. I would like to see a complete, easy-to-follow example, in our case for m6A RNA modification analysis (including a comparison between different treatments).

Sorry for being a little bit negative, but not everybody is as deep into it as you are.

Best, Matthias


HalfPhoton commented 3 months ago

@mocherry, I'll share your comments internally and we'll continue to work on improving the documentation of our software. Kind regards, Rich