YosefLab / ImpulseDE2


Using ImpulseDE2 to analyze DNA methylation data #36

Open lovebaboon1989 opened 1 year ago

lovebaboon1989 commented 1 year ago

Hi there, we are working with longitudinal DNA methylation data. I know that ImpulseDE2 works well on RNA-seq time courses and that it assumes a negative binomial (NB) distribution for gene counts, so I am wondering whether we could also use it to analyze a DNA methylation time course. My worry is that our methylation matrix holds the fraction of methylated sites, so its values range from 0 to 1, which does not seem suitable for an NB distribution. Could you please advise whether we can use ImpulseDE2 on this DNA methylation dataset? I would appreciate it a lot! Best, Weiqian

davidsebfischer commented 1 year ago

Hi Weiqian, indeed, you might run into violations of ImpulseDE2's distributional assumptions here. One could in principle set up a model like ImpulseDE2 with a Bernoulli, Gaussian, or other likelihood instead of the NB likelihood, but we have not done that. The same goes for generalized linear models with a spline basis: you could choose a more appropriate likelihood, depending on the framework. You could hack around this by discretizing your values into 0 and 1 and trying ImpulseDE2, but, as mentioned above, the assumptions might be violated, which could lead to unreliable p-values. Hope that helps a bit at least!
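
To make the "more appropriate likelihood" idea concrete, here is a minimal sketch in Python. It is not part of ImpulseDE2 and does not fit an impulse model. It assumes that, for each site, you have the raw number of methylated reads and the total coverage at every time point (not only the fraction), and it swaps in a plain binomial likelihood. The function names `binom_loglik` and `lr_statistic` are hypothetical. The sketch compares a constant methylation level against one level per time point with a likelihood ratio statistic.

```python
import math

EPS = 1e-12  # keeps log() finite when a proportion is exactly 0 or 1


def binom_loglik(meth, cov, p):
    """Binomial log-likelihood of `meth` methylated reads out of `cov`,
    without the binomial coefficient (it cancels in a likelihood ratio)."""
    p = min(max(p, EPS), 1.0 - EPS)
    return meth * math.log(p) + (cov - meth) * math.log(1.0 - p)


def lr_statistic(meth_counts, coverages):
    """Likelihood ratio statistic for one site.

    Null model: one shared methylation level across all time points.
    Alternative model: a separate methylation level per time point.
    """
    pooled = sum(meth_counts) / sum(coverages)
    ll_null = sum(binom_loglik(m, c, pooled)
                  for m, c in zip(meth_counts, coverages))
    ll_alt = sum(binom_loglik(m, c, m / c)
                 for m, c in zip(meth_counts, coverages))
    return 2.0 * (ll_alt - ll_null)


# Hypothetical example data: methylated reads and coverage at three time points.
flat_site = lr_statistic([5, 10, 15], [10, 20, 30])      # same level everywhere
changing_site = lr_statistic([2, 10, 18], [20, 20, 20])  # level rises over time
print(flat_site, changing_site)
```

Under the null model, the statistic would typically be compared to a chi-squared distribution with (number of time points - 1) degrees of freedom. Real methylation data are often overdispersed relative to the binomial, so a beta-binomial likelihood may be a better fit; that is left out here for brevity.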

lovebaboon1989 commented 1 year ago

Hi David, thanks a lot for the quick response and helpful suggestions. I will look into the distribution of our methylation data and think about next steps.