theislab / diffxpy

Differential expression analysis for single-cell RNA-seq data.
https://diffxpy.rtfd.io
BSD 3-Clause "New" or "Revised" License
179 stars 23 forks

mixed effects models #95

Open zhangguy opened 4 years ago

zhangguy commented 4 years ago

Hi, thanks for developing this package! With the increasing throughput of single-cell technology, many more samples (individuals, in the context of human tissue) are being sequenced. The variation between samples is better modeled as a random effect. Is it possible to extend diffxpy to include mixed effects models, e.g., generalized linear mixed effects models? Thanks.

davidsebfischer commented 4 years ago

Hi @zhangguy, I have some thoughts and solutions on this. I am benchmarking them in August and will come back to you once this is done. Out-of-the-box mixed effect models are not supported right now. Best, David

davidsebfischer commented 4 years ago

Just a quick update: the release of this feature has been delayed to mid-September. Sorry for that.

aidarripoll commented 4 years ago

Hi, in line with @zhangguy's comment, do you have any updates on the incorporation of mixed effect models? I'd like to know if the latest version of diffxpy can model both a random factor, such as "donor" (1|donor), and a fixed factor, such as "condition", in the same model, even though I only want to test the fixed factor "condition". Thanks!
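
For reference, the model described here can be written as a random-intercept generalized linear mixed model. This is a generic textbook formulation, not something diffxpy implements, and all symbols are introduced only for illustration: $y_{ij}$ is the expression of one gene in cell $i$ of donor $j$, $x_{ij}$ is the condition indicator, $g$ is the link function, and $u_j$ is the donor-level random intercept that corresponds to (1|donor).

```latex
g\left(\mathbb{E}[y_{ij} \mid u_j]\right) = \beta_0 + \beta_1 x_{ij} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Testing only the fixed factor "condition" then means testing $H_0\colon \beta_1 = 0$, while $\sigma_u^2$ is estimated but not itself tested.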

davidsebfischer commented 4 years ago

Hi @aidarripoll, not yet, so far we only have an estimation backend for GLMs.

davidsebfischer commented 3 years ago

@aidarripoll @zhangguy this is now planned. We will interface the statsmodels Gaussian and Poisson GLMMs for this purpose rather than provide our own estimation backend, because that is relatively involved and we can roll out this option faster this way. I will keep you updated on this issue!
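
As a rough illustration of what fitting such a model directly in statsmodels could look like (this is not diffxpy code, and which statsmodels classes the planned interface would use is an assumption), here is a minimal sketch of a Gaussian mixed model with a fixed effect for condition and a random intercept per donor. The toy data, the column names `expr`, `condition`, and `donor`, and the effect sizes are all made up.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Hypothetical toy data: 8 donors, 50 cells each, condition assigned per donor.
n_donors, n_cells = 8, 50
donor = np.repeat(np.arange(n_donors), n_cells)
condition = np.where(donor % 2 == 0, "control", "treated")

# Simulated expression: condition shift + donor-level random intercept + cell noise.
donor_effect = rng.normal(0.0, 0.5, size=n_donors)[donor]
expr = (
    1.0
    + 0.8 * (condition == "treated")
    + donor_effect
    + rng.normal(0.0, 1.0, size=donor.size)
)
df = pd.DataFrame({"expr": expr, "condition": condition, "donor": donor})

# Gaussian linear mixed model: fixed effect for condition, random intercept for donor.
model = smf.mixedlm("expr ~ condition", data=df, groups=df["donor"])
result = model.fit()

# Only the fixed effect of condition is tested (Wald test reported by statsmodels).
coef = result.params["condition[T.treated]"]
pval = result.pvalues["condition[T.treated]"]
print(f"condition effect: {coef:.3f}, Wald p-value: {pval:.3g}")
```

For count data, statsmodels also ships `PoissonBayesMixedGLM` in `statsmodels.genmod.bayes_mixed_glm`, which is fitted with variational Bayes or a MAP approximation rather than by maximum likelihood, so its output is not a drop-in replacement for the Wald p-value above.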

aidarripoll commented 3 years ago

Hi @davidsebfischer, nice to hear this!! I think it could be a good approach to add this feature to your package, both in terms of time and work! I will wait for your updates! 👍

grst commented 2 years ago

Is there any progress on this?

I found this paper by Zimmermann et al. making a strong point in favor of mixed effects models for single-cell DE analysis, and it would be nice not having to leave the scanpy ecosystem for it.

davidsebfischer commented 2 years ago

Hi @grst, we have not done this yet, but we should have capacity to address this within the next 2 months. Otherwise, I would be happy to guide a contribution of the above-mentioned wrapper of the statsmodels estimation backend, if anybody wants to give that a try before we start working on it! It should be relatively lean code, as this is just about running fits and extracting parameters.
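
To give a sense of the "running fits and extracting parameters" part, here is a minimal sketch of such a loop over genes. It is not the diffxpy interface: the helper `fit_mixedlm_per_gene`, its arguments, and the toy data are hypothetical, and it uses only the Gaussian `mixedlm` from statsmodels.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def fit_mixedlm_per_gene(expr, obs, formula="y ~ condition",
                         group_col="donor", coef_name="condition[T.treated]"):
    """Hypothetical helper: fit one Gaussian mixed model per gene.

    expr: cells x genes DataFrame; obs: per-cell metadata with the same row order.
    Returns the tested coefficient and its Wald p-value for every gene.
    """
    rows = []
    for gene in expr.columns:
        # Attach this gene's expression to the metadata as the response "y".
        data = obs.assign(y=expr[gene].to_numpy())
        res = smf.mixedlm(formula, data=data, groups=data[group_col]).fit()
        rows.append({"gene": gene,
                     "coef": res.params[coef_name],
                     "pval": res.pvalues[coef_name]})
    return pd.DataFrame(rows).set_index("gene")


# Toy data (made up): 8 donors, 40 cells each, 3 genes with different effects.
rng = np.random.default_rng(1)
n_donors, n_cells = 8, 40
donor = np.repeat(np.arange(n_donors), n_cells)
obs = pd.DataFrame({
    "donor": donor,
    "condition": np.where(donor % 2 == 0, "control", "treated"),
})
treated = (obs["condition"] == "treated").to_numpy()
true_effects = {"gene_a": 0.0, "gene_b": 1.0, "gene_c": 2.0}
expr = pd.DataFrame({
    gene: effect * treated
    + rng.normal(0.0, 0.5, size=n_donors)[donor]   # donor random intercept
    + rng.normal(0.0, 1.0, size=donor.size)        # cell-level noise
    for gene, effect in true_effects.items()
})

table = fit_mixedlm_per_gene(expr, obs)
print(table)
```

A real wrapper would additionally need multiple-testing correction across genes and some handling of fits that fail to converge, both of which are left out here.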

zhangguy commented 2 years ago

@davidsebfischer @grst @aidarripoll all, not sure if you saw this recent package: https://github.com/lhe17/nebula This package provides a negative binomial distribution model for the random effect, which might be better than the Poisson model from the Python statsmodels. It is an R package, but the estimators are written in C++, so it should be possible to develop Python bindings for diffxpy, though I am not sure how much work that would be.

davidsebfischer commented 2 years ago

Hi @zhangguy! I don't think that we will have capacity to write Python bindings in the near future. We would be happy to guide a contribution, though, if somebody wants to do that!