nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

Question: using 5mC_5hmC and 6mA models simultaneously or not? #738

Closed VahidJavaran closed 5 months ago

VahidJavaran commented 5 months ago

Hi, I'm interested in identifying base modifications within my datasets, specifically aiming to discern 5mC and 5hmC from 6mA modifications. I'm considering whether to apply both the 5mC_5hmC and 6mA base calling models at the same time or to run them independently. Additionally, I'm concerned about the accuracy of base calls, particularly the risk of incorrectly identifying methylated cytosines near regions of methylated adenine, or vice versa. Should I opt to run the models separately, I wonder about the feasibility of combining the resulting ModBAM files for subsequent downstream analysis. Best,

ymcki commented 5 months ago

Is there an HG002 benchmark for 6mA and 5mC? If so, then maybe I can check that for you with my A100.

VahidJavaran commented 5 months ago

Dear @ymcki, Thank you for your offer and for suggesting the HG002 benchmark to evaluate 6mA and 5mC detection. My project, however, involves methylation profiling in plant viruses, which is a different context from human genomic material. My dataset contains a control sample, a PCR product of a DNA virus, and 3 replicates of a plant sample that contains this virus. I would appreciate any guidance on how I should analyze these samples. Best,

vellamike commented 5 months ago

Hi @VahidJavaran, you can run the 5mC_5hmC and 6mA base calling models at the same time; they do not affect one another. They also do not affect the canonical basecalls, so running modified base calling has no bearing on the risk of incorrectly identifying methylated cytosines near regions of methylated adenine, or vice versa.
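
As a rough illustration of running both models in one pass, here is a minimal Python sketch that assembles a single `dorado basecaller` command requesting both modifications through `--modified-bases`. The `sup` model name, the `pod5s/` directory and the `calls.bam` output name are placeholders, and the helper functions are hypothetical. Whether `5mC_5hmC` and `6mA` models exist for a given chemistry and basecalling model version is an assumption to check with `dorado download --list`.

```python
import shlex
import subprocess
from typing import List


def build_dorado_command(model: str, pod5_dir: str, mods: List[str]) -> List[str]:
    """Assemble one dorado invocation that requests several modified-base models."""
    if not mods:
        raise ValueError("at least one modification name is required")
    # --modified-bases takes a space-separated list of modification names.
    return ["dorado", "basecaller", model, pod5_dir, "--modified-bases", *mods]


def run_basecalling(cmd: List[str], output_bam: str) -> None:
    """Run dorado and write its stdout (the modBAM) to output_bam."""
    with open(output_bam, "wb") as out:
        subprocess.run(cmd, stdout=out, check=True)


# Placeholder model and input directory; adjust to your own data.
cmd = build_dorado_command("sup", "pod5s/", ["5mC_5hmC", "6mA"])
print(shlex.join(cmd))
# run_basecalling(cmd, "calls.bam")  # uncomment where dorado is installed
```

Because both sets of modification calls end up in the same output file, there should be no need to merge separate ModBAM files afterwards.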