fanglab / nanodisco

nanodisco: a toolbox for discovering and exploiting multiple types of DNA methylation from individual bacteria and microbiomes using nanopore sequencing.

Only one motif detected #54

Closed tongzhouxu closed 1 year ago

tongzhouxu commented 1 year ago

Hi,

Thanks for developing nanodisco. It is a great tool. We used nanodisco to look for methylation in a Listeria strain but were only able to detect one motif, and the prediction score for the methylated site is also quite low (please see attached). The sequencing coverage is around 150x. I was wondering if you would recommend increasing coverage to detect more possible motifs.

Thanks, Tongzhou

Attachment: Motifs_classification_0h_nn_model.pdf

fanggang commented 1 year ago

Thank you for your interest and email, Tongzhou.

150X is fairly good in our experience. Although the score is not high (this happens sometimes), the heatmap you sent looks very clean (both the type and the fine-mapped base are clear). GKATMC is a nice palindrome and very likely an authentic motif in this strain.

I am not sure which Listeria species this is, but REBASE shows that some Listeria species/strains have very few methylation motifs (some other strains have more). You should not necessarily expect to find many motifs.

Hope this helps. Gang
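
As an illustration of the palindrome remark above, here is a minimal Python sketch that checks whether a motif written with IUPAC ambiguity codes equals its own reverse complement. It is not part of nanodisco; the function names `reverse_complement` and `is_palindromic` are made up for this example.

```python
# IUPAC nucleotide codes mapped to their complements
# (e.g. K = G/T pairs with M = A/C).
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(motif: str) -> str:
    """Return the reverse complement of a motif, honoring IUPAC codes."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(motif.upper()))


def is_palindromic(motif: str) -> bool:
    """A motif is palindromic if it reads the same on both strands."""
    return motif.upper() == reverse_complement(motif)


if __name__ == "__main__":
    for motif in ("GKATMC", "GTATCC"):
        print(motif, reverse_complement(motif), is_palindromic(motif))
```

With this check, the degenerate motif GKATMC is its own reverse complement, while a single concrete instance of it such as GTATCC is not (its reverse complement, GGATAC, is another instance of GKATMC).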


tongzhouxu commented 1 year ago

Hi Dr. Fang @fanggang,

Thank you so much for your quick reply! The strain we sequenced is Listeria monocytogenes H7550-cds, and I found only one motif, GTATCC, on REBASE, which seems concordant with our results. I would also like to know whether I can get all the methylation sites for a motif with nanodisco. I see there is a motifs_22bp_top_2000_peaks.fasta file with genome locations, and I wonder whether I can get all the peaks for one motif or whether I need to explore the current difference file.

Thanks, Tongzhou

fanggang commented 1 year ago

Tongzhou, great to know that your results look consistent with the expected entries in REBASE. Alan has more experience with your question about the FASTA file. At a high level, we do not recommend interpreting individual site-level methylation from nanopore data yet, at least not with nanodisco, as we explained in Q7 of the FAQ: https://nanodisco.readthedocs.io/en/latest/faq.html

Hope this helps, Gang
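
For readers who only need the genomic positions where a motif occurs (not a per-site methylation call, which the FAQ advises against), the following Python sketch lists motif occurrences in a sequence. It is independent of nanodisco and does not read any nanodisco output; the helper name `motif_positions` and the toy sequence are assumptions made for illustration. An occurrence of the motif does not by itself imply that the site is methylated.

```python
import re

# IUPAC ambiguity codes expanded to regular-expression character classes.
IUPAC_TO_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_positions(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every motif match in sequence.

    A lookahead is used so that overlapping matches are also reported.
    Only the given strand is scanned; for a palindromic motif such as
    GKATMC this already covers both strands.
    """
    pattern = "".join(IUPAC_TO_REGEX[base] for base in motif.upper())
    regex = re.compile(f"(?=({pattern}))")
    return [match.start() for match in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequence invented for this example, not real Listeria data.
    toy_sequence = "AAGTATCCTTGGATACAA"
    print(motif_positions(toy_sequence, "GKATMC"))
```

In practice the sequence would come from the reference genome FASTA, read one contig at a time; for a non-palindromic motif the reverse complement would need to be scanned as well.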


tongzhouxu commented 1 year ago

Thank you so much for your explanation, Dr. Fang!