vahidAK / NanoMethPhase

Methylation Phasing for Nanopore Sequencing
GNU General Public License v3.0

Utilizing phased BAM and phased VCF files generated by Pepper-Margin-DeepVariant as inputs #27

Closed by edeuse 1 week ago

edeuse commented 2 weeks ago

Can I use phased BAM and phased VCF files generated by Pepper-Margin-DeepVariant as inputs for dma analysis? If this approach works, does it mean that the 'methyl_call_processor' and 'phase' steps are no longer necessary before running dma?

vahidAK commented 2 weeks ago

Hi @edeuse, no, you cannot do that. However, if your BAM file has the methylation tags, you can use Modkit to extract phased methylation data in BED format from the BAM and use that as input for the dma module. Just make sure the Modkit output is compatible with the dma module; you may need to reformat the file first. And yes, in that case you do not need the previous steps, because you already have the phased methylation.
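The reformatting step mentioned above could look roughly like the sketch below. It assumes the input follows the bedMethyl layout described in the Modkit documentation (column 4 is the modification code, column 6 the strand, column 10 the valid coverage, column 12 the modified count). The function name and the output tuple layout (chromosome, start, strand, coverage, frequency) are hypothetical and are not taken from NanoMethPhase; check what the dma module actually expects before relying on it.

```python
def bedmethyl_to_frequency_rows(handle, mod_code="m"):
    """Yield (chrom, start, strand, coverage, frequency) from bedMethyl lines.

    Assumption: columns follow the bedMethyl layout documented by Modkit.
    The output layout is illustrative only, not a NanoMethPhase format.
    """
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        # split() handles both tab- and space-separated count columns
        fields = line.split()
        if len(fields) < 12:
            raise ValueError(f"expected at least 12 columns, got {len(fields)}")
        if fields[3] != mod_code:
            continue  # keep only the requested modification (e.g. 5mC = "m")
        n_valid = int(fields[9])   # reads with a valid call at this site
        n_mod = int(fields[11])    # reads called as modified
        if n_valid == 0:
            continue  # no usable coverage, frequency undefined
        yield fields[0], int(fields[1]), fields[5], n_valid, n_mod / n_valid
```

Run it once per haplotype file (for example one file for HP1 and one for HP2) and write the rows out as tab-separated text in whatever column order you then point the dma module at.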

edeuse commented 2 weeks ago

Thank you for your response. I realize my previous question might not have been entirely clear.

I used the haplotagged BAM file generated from 'Pepper-Margin-DeepVariant (PMD)', and subsequently extracted FASTQ files from this BAM using 'samtools fastq'. These FASTQ files were then used as input for the 'f5c' tool to obtain methylation call files and methylation frequency files.

I was inquiring whether these four files—haplotagged BAM (from PMD), phased VCF (from PMD), methylation call (from f5c), and methylation frequency files (from f5c)—could be utilized as input files for the 'methyl_call_processor', 'phase', and 'dma' modules, respectively.

I have successfully verified that these modules operate correctly using the aforementioned input files.

If you still do not recommend using the BAM, VCF, methylation call, and frequency files generated through PMD->f5c, I would appreciate understanding the reasoning behind this.

Additionally, I noticed that the BED files generated by 'modkit' differ in format from the methylation call files used in the 'methyl_call_processor' module, as they do not contain log values or read ID information.

vahidAK commented 2 weeks ago

Hi @edeuse, yes, you should be able to use the BAM, phased VCF, and methylation call file with the methyl_call_processor and phase modules. The f5c format should be fine, but to double-check you can see whether the columns match what I explained here. Then you can use the results from the phase module for the dma module. On the other hand, if I understood correctly, it seems you already have the phased (HP1 and HP2) f5c frequency data. If so, you should be able to use the HP1 and HP2 frequency files with the dma module to detect DMRs; just make sure to select the right columns.
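A quick way to do the column double-check is to compare the header line of the methylation call file against the columns you expect. The sketch below is only an illustration: the column names in `ASSUMED_COLUMNS` are an assumption based on the nanopolish-style header that f5c is understood to write, and the function name is hypothetical. Replace the list with the columns described in the NanoMethPhase documentation.

```python
# Assumption: nanopolish-style column names; adjust to the documented format.
ASSUMED_COLUMNS = [
    "chromosome", "strand", "start", "end", "read_name", "log_lik_ratio",
]


def missing_columns(header_line, required=ASSUMED_COLUMNS):
    """Return the required column names absent from a tab-separated header."""
    present = header_line.rstrip("\n").split("\t")
    return [name for name in required if name not in present]
```

For example, read the first line of the call file with `open(path).readline()` and pass it to `missing_columns`; an empty list means every assumed column is present.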

edeuse commented 1 week ago

Thank you for your advice. I understand that VCF and BAM from Pepper-Margin-DeepVariant can be used with NanoMethPhase. I appreciate your guidance.