dohlee / metheor

Ultrafast DNA methylation heterogeneity calculation from bisulfite alignments (Lee et al., PLOS Computational Biology, 2023)
GNU General Public License v3.0

Does Metheor calculate methylation diversity based only on CpG sites? #16

Open feihongloveworld opened 1 year ago

feihongloveworld commented 1 year ago

Does Metheor calculate methylation diversity based only on CpG sites?

dohlee commented 1 year ago

Yes, it considers only CpG sites.

feihongloveworld commented 1 year ago

But MHL was designed for MHBs, which contain more than 3 CpG sites, so I'm wondering whether MHLs computed at single CpG sites differ from MHLs computed on MHBs.

dohlee commented 1 year ago

That's right. In the original paper, MHL was defined for a genomic region, but a CpG-wise MHL was later proposed in Scherer et al., 2020, which is the benchmark paper for Metheor. Please refer to https://doi.org/10.1093/nar/gkaa120. Basically, you can think of the set of sequencing reads covering each CpG as defining the genomic region considered for the MHL calculation.
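To make the "reads covering each CpG define the region" idea concrete, here is a minimal sketch. It is not Metheor's actual code. It assumes the commonly cited MHL definition (a length-weighted average, with weight i, of the fraction of fully methylated substrings of length i) and a hypothetical read representation of `(first_cpg_index, calls)`, where `calls` is a string of `1` (methylated) and `0` (unmethylated). Details such as a maximum haplotype length or minimum depth are left out and may differ in Metheor.

```python
def mhl(haplotypes):
    """MHL over a set of methylation strings such as "1101".

    Assumed definition: sum_i(i * P(MH_i)) / sum_i(i), where P(MH_i) is the
    fraction of fully methylated substrings among all substrings of length i.
    """
    if not haplotypes:
        return None
    max_len = max(len(h) for h in haplotypes)
    numerator = 0.0
    denominator = 0.0
    for i in range(1, max_len + 1):
        total = 0
        methylated = 0
        for h in haplotypes:
            # Enumerate every substring of length i within this read.
            for s in range(len(h) - i + 1):
                total += 1
                if h[s:s + i] == "1" * i:
                    methylated += 1
        if total:
            numerator += i * methylated / total
            denominator += i
    return numerator / denominator if denominator else None


def cpg_wise_mhl(reads, cpg):
    """MHL at one CpG, using only the reads that cover that CpG index.

    `reads` is a list of (first_cpg_index, calls) tuples (hypothetical format).
    """
    covering = [calls for first, calls in reads
                if first <= cpg < first + len(calls)]
    return mhl(covering)


reads = [(0, "1111"), (0, "0000"), (2, "1111")]
print(cpg_wise_mhl(reads, 0))  # uses only the two reads starting at CpG 0
```

With this framing, the "region" for CpG 0 is whatever span the two covering reads happen to occupy, so no predefined MHB is needed.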

feihongloveworld commented 1 year ago

I just read Scherer's paper. Are PM and ME calculated only on 4-CpG windows? If a block contains more than 4 CpGs, should I use overlapping windows with a step of 1 to get PM and ME?

dohlee commented 1 year ago

Yes, PM and ME values are assigned to four consecutive CpGs (i.e., CpG quartets). If you want to compute PM and ME for a certain genomic block, you can simply average the PM/ME values of all CpG quartets overlapping that block.
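The averaging described above can be sketched as follows. This is an illustration, not Metheor's implementation. It assumes the usual definitions, PM = 1 - Σ p_i² and ME = (1/4) Σ -p_i log2 p_i over the epiallele pattern frequencies p_i of reads that fully cover a quartet. The read format `(first_cpg_index, calls)`, the CpG-index coordinates, and the `min_reads` threshold are hypothetical choices made for the example.

```python
from collections import Counter
from math import log2


def quartet_patterns(reads, start, size=4):
    """Patterns of reads that fully cover CpG indices start..start+size-1."""
    patterns = []
    for first, calls in reads:
        offset = start - first
        if offset >= 0 and offset + size <= len(calls):
            patterns.append(calls[offset:offset + size])
    return patterns


def pm(patterns):
    """Epipolymorphism: 1 minus the sum of squared pattern frequencies."""
    n = len(patterns)
    return 1.0 - sum((c / n) ** 2 for c in Counter(patterns).values())


def me(patterns, size=4):
    """Methylation entropy: Shannon entropy (bits) divided by the CpG count."""
    n = len(patterns)
    return sum(-(c / n) * log2(c / n) for c in Counter(patterns).values()) / size


def block_average(reads, block_start, block_end, size=4, min_reads=1):
    """Average PM and ME over quartets overlapping CpGs [block_start, block_end].

    Quartets slide with a step of 1 and may extend past the block edges,
    as long as they share at least one CpG with the block.
    """
    pms, mes = [], []
    for start in range(block_start - size + 1, block_end + 1):
        patterns = quartet_patterns(reads, start, size)
        if len(patterns) >= min_reads:
            pms.append(pm(patterns))
            mes.append(me(patterns, size))
    if not pms:
        return None
    return sum(pms) / len(pms), sum(mes) / len(mes)


reads = [(0, "111111"), (0, "000000"), (0, "111111"), (0, "000000")]
print(block_average(reads, 0, 5))
```

In practice you would likely raise `min_reads` so that quartets with very low coverage do not dominate the block average.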