nanoporetech / modkit

A bioinformatics tool for working with modified bases
https://nanoporetech.com/

Inferring Hypermethylated and Hypomethylated Bases from ModKit DMR Pair Results #283

Open Bhavesh-Tiwarekar opened 1 week ago

Bhavesh-Tiwarekar commented 1 week ago

Hi @ArtRand,

I hope this message finds you well. I have been using the ModKit tool for differential methylation analysis, specifically the dmr pair command. In my analysis, I compared a control sample (using -a) with a treatment sample (using -b) and obtained the following results:

[Image: dmr_query]

I am particularly interested in understanding the following points:

1) which columns to consider in order to infer hypermethylated and hypomethylated bases from the tabular output.

2) Effect Size: A positive effect size suggests hypermethylation in the treatment sample compared to the control, while a negative effect size indicates hypomethylation?

3) MAP-based P-value: <0.01 p-value entries can be considered as statistically significant and reliable?

4) Is it correct to consider effect size value of <-0.25 as hypomethylated entries with 25 percent read level methylation difference and >+0.25 as hypermethylated entries with 25 percent read level methylation difference between control and treatment sample?

Thank you for your assistance!

ArtRand commented 4 days ago

Hello @Bhavesh-Tiwarekar

which columns to consider in order to infer hypermethylated and hypomethylated bases from the tabular output.

I would use the effect_size column.

Effect Size: A positive effect size suggests hypermethylation in the treatment sample compared to the control, while a negative effect size indicates hypomethylation?

Correct.

MAP-based P-value: <0.01 p-value entries can be considered as statistically significant and reliable?

I would visualize the distribution of MAP-based p-values to determine a threshold, but 0.01 sounds reasonable. This is the default value in the segmentation routine.
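One way to look at that distribution without any plotting library is to bin the p-values and print a text histogram. This is only a minimal sketch: `pvalue_histogram` and `print_histogram` are hypothetical helpers, not part of modkit, and they assume you have already read the MAP-based p-value column from your own output into a list of floats.

```python
from collections import Counter


def pvalue_histogram(pvalues, n_bins=10):
    """Count p-values in [0, 1] into n_bins equal-width bins.

    Hypothetical helper for illustration; not part of modkit.
    """
    counts = Counter()
    for p in pvalues:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        # A p-value of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
    return [counts[i] for i in range(n_bins)]


def print_histogram(pvalues, n_bins=10):
    """Print one line per bin with a bar of '#' characters and the count."""
    for i, count in enumerate(pvalue_histogram(pvalues, n_bins)):
        lo, hi = i / n_bins, (i + 1) / n_bins
        print(f"[{lo:.2f}, {hi:.2f}) {'#' * count} {count}")
```

If most of the mass sits near zero with a long flat tail, a cutoff such as 0.01 is probably reasonable; if the distribution looks different, the threshold may need adjusting.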

Is it correct to consider effect size value of <-0.25 as hypomethylated entries with 25 percent read level methylation difference and >+0.25 as hypermethylated entries with 25 percent read level methylation difference between control and treatment sample?

This is generally the correct interpretation. I'd just be careful, since you could see an effect size like this at low coverage. The MAP-based p-value is meant to qualify the effect size based on coverage; the details are in the docs.
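Putting the points in this thread together, a per-row filter could look roughly like the sketch below. `classify_site` is a hypothetical helper, not a modkit function. The ±0.25 and 0.01 cutoffs come from the discussion above, while `min_coverage=10` is an arbitrary illustrative default. The sign convention (positive meaning hypermethylated in the `-b` sample) follows the reading given in this thread and is exposed as a parameter, so please confirm it against the docs for your modkit version. Apart from `effect_size`, which columns you map to these arguments is up to you.

```python
def classify_site(effect_size, map_pvalue, coverage_a, coverage_b,
                  min_effect=0.25, max_pvalue=0.01, min_coverage=10,
                  positive_means_hyper=True):
    """Label one DMR row as 'hyper', 'hypo', 'not_significant' or 'low_coverage'.

    Hypothetical helper for illustration; not part of modkit.
    min_coverage is an assumed default; pick a value that suits your data.
    positive_means_hyper encodes the sign convention discussed in the thread.
    """
    # Guard against large effect sizes that come from only a few reads.
    if min(coverage_a, coverage_b) < min_coverage:
        return "low_coverage"
    # Require both a small MAP-based p-value and a large enough effect.
    if map_pvalue >= max_pvalue or abs(effect_size) <= min_effect:
        return "not_significant"
    is_positive = effect_size > 0
    return "hyper" if is_positive == positive_means_hyper else "hypo"
```

For example, `classify_site(0.4, 0.001, 30, 25)` returns `"hyper"`, while the same effect size with `coverage_a=3` returns `"low_coverage"`.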