nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

How does dorado handle 3D data #1089

Open happier21 opened 3 weeks ago

happier21 commented 3 weeks ago

In our experiment, we cross-linked chromosome fragments that may interact in three-dimensional space, then basecalled the reads with dorado and called methylation. I need to split the cross-linked chromosome fragments and then align them back to the reference genome. When I split them, how do I carry the methylation information over to the split fragments? Or can dorado do the splitting together with the methylation calling? Waiting for your reply. Thank you.

HalfPhoton commented 3 weeks ago

This question isn't a dorado issue and is probably best asked on the Nanopore community forum.

However, if I understand your question correctly: Dorado annotates mod-basecalled reads with the ML and MM tags but doesn't provide a way to split the reads into fragments.

You'll need to use other tools for post-processing such as samtools which should preserve these tags when splitting.

Kind regards, Rich
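
To make the question about carrying methylation calls over to split fragments concrete, here is a minimal Python sketch of how the MM and ML tags can be decoded into per-base calls and then assigned to fragments of a read. It is an illustration only, not part of dorado or samtools. The function names `decode_mm_ml` and `assign_to_fragments` are hypothetical, and the sketch assumes the read sequence is in its original (unaligned) orientation, that every MM entry is on the `+` strand, and that each entry carries a single modification code.

```python
def decode_mm_ml(seq, mm, ml):
    """Decode MM/ML into (read_position, mod_code, ml_value) tuples.

    Assumptions (hypothetical helper, not a dorado API): '+' strand
    entries only, one modification code per entry, and seq given in
    the original read orientation.
    """
    calls = []
    ml_index = 0  # ML values are stored in the same order as the MM calls
    for entry in mm.split(";"):
        if not entry:
            continue
        header, *deltas = entry.split(",")
        base, strand, code = header[0], header[1], header[2:].rstrip("?.")
        if strand != "+" or len(code) != 1:
            raise ValueError(f"unsupported MM entry: {header}")
        # Read positions of every occurrence of the canonical base.
        candidates = [i for i, b in enumerate(seq) if b == base]
        idx = -1
        for delta in deltas:
            # Each number is how many occurrences of the base to skip.
            idx += int(delta) + 1
            calls.append((candidates[idx], code, ml[ml_index]))
            ml_index += 1
    return calls


def assign_to_fragments(calls, breakpoints, read_length):
    """Group calls by fragment; positions become fragment-relative."""
    edges = [0, *sorted(breakpoints), read_length]
    fragments = [[] for _ in range(len(edges) - 1)]
    for pos, code, value in calls:
        for k in range(len(fragments)):
            if edges[k] <= pos < edges[k + 1]:  # half-open interval
                fragments[k].append((pos - edges[k], code, value))
                break
    return fragments


# Toy read: the C bases sit at read positions 1, 4 and 5.
seq = "ACGTCCGA"
calls = decode_mm_ml(seq, "C+m?,0,1;", [200, 30])
print(calls)
print(assign_to_fragments(calls, [4], len(seq)))
```

The point of the sketch is that MM stores skip counts relative to the full read, so cutting the read sequence without recomputing those counts would leave the tags pointing at the wrong bases.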

biozzq commented 3 weeks ago

Dear @HalfPhoton

I think I have encountered a similar issue and would like to ask you for help.

The use of long-read sequencing technologies in the context of three-dimensional genomics, protein-DNA interactions, and simultaneous DNA methylation profiling has become achievable.

As you are aware, incorporating three-dimensional genomic information can lead to the presence of a significant number of chimeric reads in the sequencing results. These chimeras can typically be effectively aligned to the reference genome through split mapping, where the split positions correspond to the digest sites or added linker sequences. However, due to the inherent accuracy issues associated with ONT base calling, splitting at the read level may not always be the optimal approach. Instead, splitting based on the alignment results relative to the reference genome might yield better outcomes.

Given the above context, I would like to ask whether BAM files generated from the basecalling and alignment process using dorado can be directly used for quantitative methylation analysis with modkit.

Thank you for your time, and I look forward to your insights on this matter.

Best regards,

Zheng zhuqing
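
As a rough illustration of what splitting based on alignment results could look like, the Python sketch below derives, from the CIGAR string of one alignment record, which part of the original read that record covers. Each alignment of a chimeric read (primary or supplementary) clips away the parts of the read it does not cover, so the clip lengths locate the aligned piece within the read. The function name `query_interval` is hypothetical, and this is a sketch under those assumptions rather than how any particular tool does it.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_CONSUMING = set("MI=X")  # operations that use up read bases
CLIPS = set("SH")              # soft and hard clips


def query_interval(cigar, is_reverse=False):
    """Return the half-open (start, end) interval on the original read.

    For reverse-strand records the CIGAR describes the reverse
    complemented read, so the trailing clip gives the start instead.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

    # Sum the clips at the left end of the CIGAR.
    lead, i = 0, 0
    while i < len(ops) and ops[i][1] in CLIPS:
        lead += ops[i][0]
        i += 1

    # Sum the clips at the right end of the CIGAR.
    trail, j = 0, len(ops) - 1
    while j >= i and ops[j][1] in CLIPS:
        trail += ops[j][0]
        j -= 1

    aligned = sum(n for n, op in ops[i:j + 1] if op in QUERY_CONSUMING)
    start = trail if is_reverse else lead
    return start, start + aligned


print(query_interval("100S50M2I30M20H"))
print(query_interval("100S50M2I30M20H", is_reverse=True))
```

Collecting these intervals for every record of a read gives candidate split positions in read coordinates, which can then be compared with the expected digest sites or linker positions.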

HalfPhoton commented 3 weeks ago

Hi @biozzq,

> I would like to ask whether BAM files generated from the base calling and alignment process using dorado can be directly used for quantitative methylation analysis with modkit.

Yes - the output from dorado can be used for methylation analysis in modkit.

> However, due to the inherent accuracy issues associated with ONT base calling, splitting at the read level may not always be the optimal approach. Instead, splitting based on the alignment results relative to the reference genome might yield better outcomes.

We're always working on improving basecalling accuracy and performance in Dorado to meet the needs of our users. I'll raise this use case with the team to discuss how we can better support these interesting workflows, especially with regard to read splitting, if it's problematic in three-dimensional genomics.

Best regards, Rich

biozzq commented 2 weeks ago

Dear @HalfPhoton

Thank you for your prompt response. I have a few more questions that I would like to ask you. I obtained the modBAM file with dorado using the following command:

dorado-0.7.3-linux-x64/bin/dorado basecaller dna_r10.4.1_e8.2_400bps_sup@v4.2.0 ./pod5_pass --modified-bases 6mA 5mC_5hmC --reference Hg38.fa | samtools view -bhS > output.bam

I have attached a portion of the alignments in the file subset.bam.zip. Upon examining the methylation information in it, I found that the supplementary alignment records do not have MM and ML tags; see, for example, the alignments for the read named "c59f6589-a8c9-4091-9e8e-3afca66085b5". Can modkit accurately assess the methylation information for the regions targeted by those supplementary alignments?

Thank you for your assistance. Best regards, Zheng zhuqing
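
One way to quantify the observation above is to count, per alignment type, how many records carry both MM and ML tags. The Python sketch below does this on SAM text lines (for example the output of `samtools view`). The function names and the two toy records are made up for illustration, and the sketch says nothing about why the tags are absent or how modkit treats such records.

```python
from collections import Counter


def alignment_kind(flag):
    """Classify a record from its SAM FLAG bits."""
    if flag & 0x4:
        return "unmapped"
    if flag & 0x800:
        return "supplementary"
    if flag & 0x100:
        return "secondary"
    return "primary"


def count_mod_tags(sam_lines):
    """Count records by (alignment kind, has both MM and ML tags)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
        tags = {field[:2] for field in fields[11:]}
        has_mods = {"MM", "ML"} <= tags
        counts[(alignment_kind(flag), has_mods)] += 1
    return counts


# Two made-up records for one read: a primary and a supplementary alignment.
toy_sam = [
    "read1\t0\tchr1\t100\t60\t4M4S\t*\t0\t0\tACGTCCGA\t*\tMM:Z:C+m?,0,1;\tML:B:C,200,30",
    "read1\t2048\tchr2\t500\t60\t4H4M\t*\t0\t0\tCCGA\t*\tSA:Z:chr1,100,+,4M4S,60,0;",
]
for key, n in sorted(count_mod_tags(toy_sam).items()):
    print(key, n)
```

On a real file, the same function could be fed the lines of `samtools view subset.bam` read from standard input.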