smithlabcode / dnmtools

Tools for analyzing DNA methylation data
https://dnmtools.readthedocs.io
GNU General Public License v3.0

SYM - collapsing counts for other methylation context #41

Open olaraym opened 1 year ago

olaraym commented 1 year ago

Hi @iromeo @songqiang @saketkc @egor-dolzhenko

Thanks for this great tool. I find the sym option of dnmtools useful for making sense of methylated positions in the CpG context. However, I believe it does not work for the CXG, CCG, and CHH contexts, which are important for anyone who wants to use this tool for plant DNA methylation calling.

I need your suggestion on processing the CHG (CXG, CCG) and CHH outputs for downstream analysis, since dnmtools sym was not designed to handle CHH and CHG methylation calls. What would you advise, given that I need this methylation information to understand methylation patterns in my plant of interest? Thanks.

Regards.

andrewdavidsmith commented 1 year ago

@olaraym I personally can't advise you in your case. What I recall about non-CpG methylation in plants (specifically Arabidopsis) is that even at "symmetric" non-CpG sites, strand might still matter. I vaguely recall that retrotransposon methylation is strand-specific for certain contexts. So it's your choice as to whether it makes sense to collapse these. Note that CHH is not symmetric in any sense.

I have no problem adding this functionality to sym, but it would help if I had a biological true-positive case where there is evidence that the analysis should collapse, for example, both strands of a CHG site.

olaraym commented 1 year ago

Thanks @andrewdavidsmith for your help

This is understandable. I feel that CpG and CHG sites, which are symmetric, can be collapsed, since information from both strands gives better evidence of methylation at a particular cytosine. I know that repetitive sequences in plants are heavily methylated in the CHH context, and the fact that this context is not symmetric means that collapsing it is not worth doing. However, it might greatly impact the downstream analysis, in my opinion.

Regards
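
For anyone who decides that collapsing is appropriate for their data, here is a minimal sketch of what merging the two strands of a symmetric non-CpG site could look like outside of dnmtools. It is not part of dnmtools and not how `sym` is implemented. It assumes each site is described by six fields (chromosome, position, strand, context, methylation level, read count), that the `CXG` label marks sites whose partner cytosine sits 2 bases downstream on the opposite strand, and that methylated read counts can be recovered by rounding level × reads. The `Site` type and `collapse_symmetric` function are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Hypothetical record type; the six fields are an assumed layout, not a dnmtools API.
Site = namedtuple("Site", "chrom pos strand context level n_reads")


def collapse_symmetric(sites, context="CXG", offset=2):
    """Merge each + strand site with the - strand site `offset` bases downstream.

    Sites in other contexts, and sites without a partner, are left out of the result.
    """
    # Index the minus-strand sites of the requested context by (chrom, pos).
    minus = {
        (s.chrom, s.pos): s
        for s in sites
        if s.strand == "-" and s.context == context
    }
    merged = []
    for s in sites:
        if s.strand != "+" or s.context != context:
            continue
        mate = minus.get((s.chrom, s.pos + offset))
        if mate is None:
            continue  # no partner on the opposite strand, nothing to merge
        n_reads = s.n_reads + mate.n_reads
        # Recover methylated read counts from the level (assumes level = meth / reads).
        n_meth = round(s.level * s.n_reads) + round(mate.level * mate.n_reads)
        level = n_meth / n_reads if n_reads > 0 else 0.0
        # Report the merged site at the + strand position.
        merged.append(Site(s.chrom, s.pos, "+", context, level, n_reads))
    return merged


sites = [
    Site("chr1", 100, "+", "CXG", 0.5, 4),
    Site("chr1", 102, "-", "CXG", 1.0, 6),
    Site("chr1", 200, "+", "CHH", 0.25, 8),  # asymmetric context, not merged
]
collapsed = collapse_symmetric(sites)
```

With the toy input above, the two `CXG` records pool into one site with 10 reads, 8 of them methylated. Whether pooling like this is biologically justified is exactly the open question in this thread, and CHH sites have no partner to merge with in any case.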