Closed martinjvickers closed 6 years ago
Hi @martinjvickers,
Thanks for using our pipeline. radmeth merge uses uncorrected p-values to create a composite p-value for the DMR using the Liptak-Stouffer method for combining probabilities (see @egor-dolzhenko's paper: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-15-215). Egor may be able to give more insight into the appropriateness of applying FDR before versus after the merge step, but my intuition tells me that by combining probabilities you're reducing the overall number of tests you're analyzing downstream, and therefore FDR after combining should be fine.
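For readers unfamiliar with the Liptak-Stouffer idea, here is a minimal Python sketch of the weighted Z-score combination. It is an illustration only, not radmeth's implementation: the function name `stouffer_liptak`, the equal default weights, and the `eps` clamp are assumptions made for this example, and the sketch treats the p-values as independent, whereas a real tool may account for correlation between neighboring CpGs.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_liptak(pvalues, weights=None, eps=1e-12):
    """Combine p-values with the (weighted) Stouffer-Liptak Z-score method.

    Hypothetical helper for illustration; assumes independent tests.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * len(pvalues)  # equal weights by default
    if len(weights) != len(pvalues):
        raise ValueError("weights and pvalues must have the same length")

    std_normal = NormalDist()
    # Clamp to (0, 1) so the inverse CDF is defined, then map each
    # p-value to a Z-score: a small p-value gives a large positive Z.
    z_scores = [
        std_normal.inv_cdf(1.0 - min(max(p, eps), 1.0 - eps))
        for p in pvalues
    ]
    # Weighted sum of Z-scores, rescaled to unit variance.
    combined_z = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    # Convert the combined Z-score back to a p-value.
    return 1.0 - std_normal.cdf(combined_z)
```

Under this scheme, several sites that are each only modestly significant can combine into a much smaller p-value: for example, `stouffer_liptak([0.04, 0.04, 0.04])` comes out well below 0.04.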
More generally, I tend to avoid the radmeth merge function because I believe it gives very strict results (typically the DMRs are quite small: only a few CpG sites in length at most) and because of this ambiguity about where to apply FDR. Instead, I prefer one of two alternative approaches to identifying DMRs using radmeth:
1) I calculate the average methylation in a set of genomic features of interest (such as CpG islands or gene promoters) and treat them like CpG sites as input to radmeth, where any "CpG site" called as differential is actually a region, or
2) I call DM CpGs using radmeth at single-CpG resolution and then count the number and direction of the significant CpGs (using the FDR-corrected p-value) in my regions of interest, ranking them by the number of significant CpGs in the region.
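Approach 2 can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the tuple layouts for CpGs `(chrom, pos, fdr_p, meth_diff)` and regions `(chrom, start, end, name)`, the function name `rank_regions`, and the 0.05 cutoff are all hypothetical and do not correspond to radmeth's output format. Regions are treated as half-open intervals, BED-style.

```python
from collections import defaultdict


def rank_regions(cpgs, regions, alpha=0.05):
    """Count significant CpGs per region, split by direction, and rank.

    cpgs:    iterable of (chrom, pos, fdr_p, meth_diff)  (assumed layout)
    regions: iterable of (chrom, start, end, name)       (assumed layout)
    """
    counts = {}
    by_chrom = defaultdict(list)
    for chrom, start, end, name in regions:
        counts[name] = {"hyper": 0, "hypo": 0}
        by_chrom[chrom].append((start, end, name))

    for chrom, pos, fdr_p, meth_diff in cpgs:
        if fdr_p >= alpha:
            continue  # keep only CpGs significant after FDR correction
        for start, end, name in by_chrom.get(chrom, []):
            if start <= pos < end:  # half-open interval [start, end)
                # A positive difference counts as hypermethylated;
                # zero or negative is counted as hypomethylated.
                direction = "hyper" if meth_diff > 0 else "hypo"
                counts[name][direction] += 1

    # Rank regions by total number of significant CpGs, largest first.
    return sorted(
        counts.items(),
        key=lambda item: item[1]["hyper"] + item[1]["hypo"],
        reverse=True,
    )
```

The linear scan over regions is fine for a few thousand promoters or CpG islands; for genome-wide region sets an interval index or a tool such as bedtools would be a more practical choice.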
I hope this was helpful!!
Ben
Thanks for the question, Martin. You observed correctly that the CpG count reported by radmeth merge is based on the original, unmerged p-values. While the significance of a region is assessed using the combined p-values, knowing the number of CpGs within a DMR that exhibit differential methylation is a useful metric for stratifying significant DMRs. It helps to separate regions where each CpG exhibits a modest amount of differential methylation from those where many CpGs individually exhibit strong differential methylation (both types of regions could lead to significant merged p-values).
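To see why the count can differ depending on which p-value is used, here is a small Python sketch comparing the number of sites below a cutoff before and after a multiple-testing correction. It assumes a Benjamini-Hochberg style adjustment, which may not match exactly what radmeth applies; the function names and the 0.05 cutoff are hypothetical choices for this example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_significant(pvalues, alpha=0.05):
    """Count how many p-values fall strictly below alpha."""
    return sum(p < alpha for p in pvalues)
```

Because adjusted p-values are never smaller than the originals, `count_significant(benjamini_hochberg(p))` can only be less than or equal to `count_significant(p)`, so a count based on the original p-values is the more permissive of the two.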
Hi @bdecato and @egor-dolzhenko, thank you very much for those clarifications; that was very useful.
I'm a bit confused by what is happening with the radmeth part of section 3.2.2 in the methpipe-manual. I've been trying to run DM analysis on some data with three reps of each control/case.
Firstly, in the manual it mentions;
However column 5 ($5) is the original p-value, column 7 is the FDR-corrected p-value.
This leads me to where I'm getting confused. All my analysis has gone well until the radmeth merge step, specifically relating to the number of sites significantly methylated within the DMR. I have the following DMRs created using;
As you can see the number of significantly differentially methylated CpGs in this case is low (1 or 2). When looking at the sites that contribute to each of these DMRs I can see that rather than the FDR-corrected p-value being used to determine the number of significantly differentially methylated CpGs, it's the original p-value.
Is this the way it's supposed to be? Shouldn't it be using the FDR-corrected p-value to determine the number of significantly differentially methylated sites?
Many thanks.