Open ziczhang opened 2 years ago
If you've got a modified bam file from megalodon, you can use this tool to extract other modified motifs AFAIK
Hi Colin, yes, I know modbam2bed, but the motifs that modbam2bed can detect are limited to CpG, CHG, and CHH. So I'm looking for a way to detect more sequence motifs without having to run Megalodon multiple times.
OK, great.
What about these? (I have never tried them, but I'm curious about them.)
-m, --mod_base=BASE Modified base of interest, one of: 5mC, 5hmC, 5fC,
5caC, 5hmU, 5fU, 5caU, 6mA, 5oxoG, Xao.
Sorry, by sequence motif I mean more complex motifs such as TAmCAG or something similar.
I would suggest outputting all context modified bases from megalodon. Calls in new contexts cannot be generated without rerunning megalodon.
From the all context calls output, motif specific calls can be extracted with the megalodon_extras modified_bases create_motif_bed command followed by the bedtools intersect command.
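As a rough illustration of the extraction step described above, the sketch below does in plain Python what the two commands do conceptually: locate every occurrence of a motif on the reference, then keep only the per-site records that fall on the modified position. The function names (motif_sites, filter_calls), the toy sequence, and the (chrom, start, strand, fraction) record layout are assumptions made for this example, not Megalodon's or bedtools' actual API or output format. Only the forward strand is handled, to keep the sketch short.

```python
import re

# IUPAC codes expanded to regex character classes (a subset is enough here)
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "H": "[ACT]", "W": "[AT]", "S": "[CG]",
    "R": "[AG]", "Y": "[CT]",
}

def motif_sites(seq, motif, rel_pos):
    """Return 0-based forward-strand positions of the modified base.

    rel_pos is the 0-based offset of the modified base inside the motif.
    A lookahead is used so that overlapping motif hits are all reported.
    """
    pattern = "(?=" + "".join(IUPAC[b] for b in motif) + ")"
    return {m.start() + rel_pos for m in re.finditer(pattern, seq)}

def filter_calls(calls, chrom, sites):
    """Keep forward-strand calls whose start coordinate is a motif site."""
    return [c for c in calls if c[0] == chrom and c[2] == "+" and c[1] in sites]

# Toy reference with two TACAG motifs; the modified C is at offset 2.
reference = "GGTACAGTTCGATACAGC"
sites = motif_sites(reference, "TACAG", 2)

# Hypothetical all-context calls: (chrom, start, strand, fraction_modified)
all_context = [
    ("chr1", 4, "+", 0.91),   # C inside the first TACAG
    ("chr1", 9, "+", 0.40),   # C in a CG context, not in the motif
    ("chr1", 14, "+", 0.85),  # C inside the second TACAG
]
motif_calls = filter_calls(all_context, "chr1", sites)
print(sorted(sites))
print(motif_calls)
```

The point of the design is that the expensive step (all context calling) runs once, and each new motif only costs a cheap positional filter.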
Thanks Marcus. I have read the same suggestion from you in another issue.
Your suggestion is very helpful, but as I understand it, megalodon_extras modified_bases create_motif_bed is limited to motifs present in the reference. Can it also detect motifs created by variants?
For example, if such a mutation is present in the sample, can this C modification be detected?
Ref:  NNNAGNNN
         ↓
Read: NNNCGNNN
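To make the question concrete, here is a small sketch of why a reference-anchored motif search misses such a site. The sequences and the helper name cpg_positions are made up for illustration, and the read is assumed to be aligned to the reference without indels, so positions line up one to one.

```python
def cpg_positions(seq):
    """Return 0-based positions of the C in every CG dinucleotide."""
    return {i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"}

ref = "TTTAGTTT"   # reference carries AG at positions 3-4
read = "TTTCGTTT"  # read carries an A>C variant at position 3, creating a CpG

ref_sites = cpg_positions(ref)
read_sites = cpg_positions(read)

# Sites present only in the read are invisible to a motif BED built
# from the reference sequence.
missed = read_sites - ref_sites
print(missed)
```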
I would suggest that this is a custom processing request which I am not sure we can support without a larger use case. You can see the core logic to implement such a request in the remora code here.
Could you specify what other output you would require in this output format? For example, such output would be per-read, so it could not include fraction modified or other aggregated results, since each read will have differing basecalls at reference locations.
I see where I went wrong. The modified_bases.5mC.bed is called against the sequence of the reference genome, not against the basecalled fastq. So, if I want to find a specific motif, your suggestion of using megalodon_extras modified_bases create_motif_bed is correct, but if I also want to examine methylation differences around SNPs, I have to use a modified reference genome to re-call methylation, right?
Is it possible to use a previously basecalled fastq file and skip the basecalling step when calling methylation against a different reference genome?
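For the modified reference mentioned above, a minimal sketch of substituting known SNPs into a reference sequence could look like the following. The function name apply_snps and the (pos, ref_base, alt_base) tuple layout are assumptions for this example; a real workflow would more likely start from a VCF and a dedicated consensus tool, and this sketch handles single-base substitutions only.

```python
def apply_snps(seq, snps):
    """Return seq with each (pos, ref_base, alt_base) substitution applied.

    Positions are 0-based. The expected reference base is checked first so
    that a coordinate mix-up fails loudly instead of silently editing the
    wrong base.
    """
    bases = list(seq)
    for pos, ref_base, alt_base in snps:
        if bases[pos] != ref_base:
            raise ValueError(
                f"expected {ref_base} at {pos}, found {bases[pos]}"
            )
        bases[pos] = alt_base
    return "".join(bases)

reference = "TTTAGTTT"
sample_reference = apply_snps(reference, [(3, "A", "C")])
print(sample_reference)
```

Calls made against such a sample-specific reference would then include the CpG created by the variant, at the cost of rerunning the modified base calling.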
Hi,
I used Megalodon to call CpG with the following command
and if I want to call another motif such as mCNG, can I reuse the CpG output files with some commands?
Thanks, Zicong