Closed lucy924 closed 4 months ago
Hello @lucy924,
Significance tests over regions are tricky to get right in a general way. As the regions get larger or the modified bases become denser, the number of regions that end up being "significant" increases simply because the test becomes overpowered. I appreciate that experimenters often want some kind of decision function with which to say "these are differentially methylated regions". Here are a couple of ideas:
Use `--segment` and decide a region is different if the whole region, or some proportion of it, is labeled as "different". One thing I'd like to add soon is to emit the posterior probabilities in the segmentation output, so you could say that "this region is labeled as 'different' with X probability". I just haven't gotten around to implementing that yet.
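To make the "some proportion is labeled as different" idea a bit more concrete, here is a rough Python sketch. It assumes you have already parsed the segmentation output into non-overlapping `(start, end, label)` tuples with half-open coordinates. That tuple layout, the label string `"different"`, the function names, and the 0.5 cutoff are all placeholders of mine, not the tool's actual column names or a recommended threshold, so adjust them to whatever your output really looks like.

```python
from typing import Iterable, Tuple

# Hypothetical layout: (start, end, label), half-open [start, end).
Segment = Tuple[int, int, str]


def fraction_different(
    region_start: int,
    region_end: int,
    segments: Iterable[Segment],
    different_label: str = "different",  # assumed label string
) -> float:
    """Fraction of the region's bases covered by segments labeled as different.

    Assumes the segments do not overlap each other.
    """
    region_len = region_end - region_start
    if region_len <= 0:
        raise ValueError("region must have positive length")

    covered = 0
    for seg_start, seg_end, label in segments:
        if label != different_label:
            continue
        # Length of the intersection between the segment and the region.
        overlap = min(region_end, seg_end) - max(region_start, seg_start)
        if overlap > 0:
            covered += overlap
    return covered / region_len


def is_different(
    region_start: int,
    region_end: int,
    segments: Iterable[Segment],
    min_fraction: float = 0.5,  # arbitrary illustrative cutoff
) -> bool:
    """Call a region 'different' if enough of it is labeled as different."""
    return fraction_different(region_start, region_end, segments) >= min_fraction


# Toy example: a 100 bp region with 20 bp of 'different' at each end.
example_segments = [
    (50, 120, "different"),
    (120, 180, "same"),
    (180, 260, "different"),
]
print(fraction_different(100, 200, example_segments))  # 0.4
print(is_different(100, 200, example_segments, min_fraction=0.25))  # True
```

Setting `min_fraction=1.0` gives the stricter "whole region" rule. Measuring coverage in base pairs is only one choice; counting the fraction of CpG sites inside "different" segments may suit CpG islands better.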
Sorry I don't have a more concrete suggestion, let me think about it a little more.
Thank you for your advice! I'll explore more with these in mind.
Great, feel free to re-open this issue if you have any additional questions. I'll ping here if I think of some additional advice.
Hi, I'm looking for some advice. I've been using `dmr multi` on version 0.2.8 with CpG islands as the regions. This does not output the MAP-based p-value; as I understand it, that is because the comparison is on regions rather than single sites. I'm not a statistician. I have looked at issues #93 and #122, but I was wondering if you had any advice on assessing the significance of the score values when using the regions option? My experimental setup is the same as the one @EpiAllele mentioned in #122. Most of my scores are <20, but in a single paired test I have a few between 20 and 40, one at 68 and one at 282. The other pairs have a similar spread, although none are higher than 100. I appreciate the great package! Thank you!