smithlabcode / methpipe

A pipeline for analyzing DNA methylation data from bisulfite sequencing.
http://smithlabresearch.org/methpipe

MethPipe ASM score query #174

Closed kashiff007 closed 1 year ago

kashiff007 commented 3 years ago

Dear MethPipe Team,

I am using methylkit and struggling to understand the ASM score in the allelicmeth output.

I know that the ASM probability score ranges from 0 to 1, and I assumed that a lower score means less ASM and a higher score means more ASM. I also understand that higher ASM means a higher number of MU and UM reads.

But many tuples in the allelicmeth output are confusing. For example:

chr1 100570   +   CpG     1       0       0       0       0       0
chr1 101000   +   CpG     1       31      0       0       0       31

Why is the ASM score 1 here, despite there being no coverage or no MU and UM reads?

I have read the documentation (http://smithlabresearch.org/downloads/methpipe-manual.pdf and https://www.pnas.org/content/109/19/7332.long) several times but was unable to work it out (maybe I need to read it once more). I would be very grateful if you could explain, or point me to a source I might be missing.

Kind Regards, Kashif

guilhermesena1 commented 3 years ago

Hello,

This is an excellent question and one that we have addressed in more detail in our documentation. The name "score" for the fifth column of the output is misleading: the value is actually a p-value from a Fisher's exact test, i.e. the probability of seeing allele-specific read counts at least this extreme by chance. When the counts are 0, the p-value is 1, since no ASM is observed.

In other words, the score should be read as "lower is better": lower values provide more evidence for ASM, not less. In future versions of MethPipe we might switch this to -log(p) so that it can be interpreted more naturally as a score.
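To make this concrete, here is a minimal sketch in Python of a two-sided Fisher's exact test on the read counts from the two rows in the question. It is not the allelicmeth source code. It assumes the columns after the score are total reads, MM, MU, UM and UU, and that the 2x2 table is arranged as [[MM, MU], [UM, UU]]. The function names and the third count tuple (15, 0, 0, 16) are hypothetical and only there for illustration.

```python
from math import comb, log10


def fisher_exact_two_sided(mm, mu, um, uu):
    """Two-sided Fisher's exact p-value for the 2x2 table [[mm, mu], [um, uu]].

    Assumption: this table layout is illustrative; allelicmeth may arrange
    or test the counts differently.
    """
    row1 = mm + mu              # reads methylated at the first CpG
    col1 = mm + um              # reads methylated at the second CpG
    n = mm + mu + um + uu       # total reads covering the pair
    denom = comb(n, col1)

    def prob(a):
        # Hypergeometric probability of `a` reads in the top-left (MM) cell
        return comb(row1, a) * comb(n - row1, col1 - a) / denom

    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    p_obs = prob(mm)
    # Sum over all tables that are at most as likely as the observed one
    p = sum(prob(a) for a in range(lo, hi + 1) if prob(a) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def neg_log10(p):
    """-log10(p), the transformation mentioned above as a possible future score."""
    return 0.0 if p >= 1.0 else -log10(p)


def parse_row(line):
    """Split one output row; the column order is an assumption (see above)."""
    chrom, pos, strand, context, score, total, mm, mu, um, uu = line.split()
    return chrom, int(pos), float(score), (int(mm), int(mu), int(um), int(uu))


rows = [
    "chr1 100570   +   CpG     1       0       0       0       0       0",
    "chr1 101000   +   CpG     1       31      0       0       0       31",
]

for row in rows:
    chrom, pos, reported, counts = parse_row(row)
    p = fisher_exact_two_sided(*counts)
    print(chrom, pos, "reported:", reported, "recomputed:", p)

# Hypothetical counts where reads split into fully methylated and fully
# unmethylated groups; the p-value becomes very small.
p_split = fisher_exact_two_sided(15, 0, 0, 16)
print("split reads:", p_split, "-log10(p):", neg_log10(p_split))
```

Under these assumptions both rows from the question give a p-value of 1.0: with no reads, or with all 31 reads being UU, only one table is possible, so there is nothing to distinguish from chance.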

kashiff007 commented 3 years ago

Thanks for your simple and helpful explanation. Best, Kashif