smithlabcode / methpipe

A pipeline for analyzing DNA methylation data from bisulfite sequencing.
http://smithlabresearch.org/methpipe

How are you defining a hypomethylated region numerically? #175

Closed vagan21 closed 3 years ago

vagan21 commented 3 years ago

Hello, in your documentation you say that you call hypomethylated regions (HMRs) using the hidden Markov model with the hmr program. Nowhere do you say what is the numeric range that this model uses for classifying what is an HMR. I'm looking for an answer like, "Genomic regions that are anywhere between 5 and 20% methylated are considered HMRs." What is this range for HMRs? And what would it be for a HYPERmethylated region?

Thank you!

bdecato commented 3 years ago

Hi @vagan21,

> Nowhere do you say what is the numeric range that this model uses for classifying what is an HMR.

The strength of our hidden Markov model approach is that we do not use a predefined range for classifying an HMR: we learn it from the data. The mean methylation level inside and outside of HMRs (or hyperMRs) will therefore differ from sample to sample, but in each sample it reflects the highest-likelihood segmentation of the data into two beta-binomial emission distributions.
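To make the "two beta-binomial emission distributions" idea concrete, here is a minimal Python sketch of how a single CpG's read counts can be scored under two such distributions. It is an illustration only, not the hmr implementation: the parameter values and the names `HYPO`, `BACKGROUND`, and `emission_log_ratio` are made up for the example, and the real model also uses transition probabilities between states, which this sketch ignores.

```python
from math import lgamma

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_logpmf(k, n, alpha, beta):
    """Log-probability of k methylated reads out of n under a
    beta-binomial distribution with shape parameters alpha and beta."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)

# Hypothetical (alpha, beta) pairs, NOT values from any real sample.
# In practice these would be learned from the data for each sample.
HYPO = (1.0, 9.0)        # emission distribution inside HMRs (mean 0.1)
BACKGROUND = (8.0, 2.0)  # emission distribution outside HMRs (mean 0.8)

def emission_log_ratio(k, n):
    """Positive when the counts are better explained by the HYPO
    distribution, negative when BACKGROUND fits better."""
    return (beta_binomial_logpmf(k, n, *HYPO)
            - beta_binomial_logpmf(k, n, *BACKGROUND))
```

With these made-up parameters, a CpG with 1 methylated read out of 10 gives a positive log ratio and one with 9 out of 10 gives a negative one. There is no fixed cutoff anywhere in the code: the boundary between the two states moves whenever the learned parameters change.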

You can take a look at the parameter-out files to get the parameters of these emission distributions, which should give you a sense of (for each sample) the distribution of methylation levels for each CpG inside and outside HMRs.
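As a rough sketch of how to read those parameters: if each emission distribution is described by two shape parameters (called `alpha` and `beta` here, which is an assumption about naming), the mean methylation level it implies is `alpha / (alpha + beta)`. The values below are placeholders, not output from any real run, and the sketch does not parse the parameter file itself because its exact layout is not described in this thread.

```python
from math import sqrt

def beta_mean(alpha, beta):
    """Mean methylation level implied by a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

def beta_sd(alpha, beta):
    """Standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return sqrt(alpha * beta / (total * total * (total + 1.0)))

if __name__ == "__main__":
    # Placeholder parameters for illustration only; substitute the values
    # reported for your own sample.
    inside_hmr = (1.0, 9.0)
    outside_hmr = (8.0, 2.0)
    for label, (a, b) in (("inside HMRs", inside_hmr),
                          ("outside HMRs", outside_hmr)):
        print(f"{label}: mean={beta_mean(a, b):.3f} sd={beta_sd(a, b):.3f}")
```

Comparing the two means (and their spread) for each sample gives the closest thing to the "numeric range" asked about, with the caveat that it is specific to that sample.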