theislab / scCODA

A Bayesian model for compositional single-cell data analysis
BSD 3-Clause "New" or "Revised" License

How to approach inference on data with multiple timepoints #31

Closed by fbrundu 3 years ago

fbrundu commented 3 years ago

Hi,

I wonder if you have any suggestions on how to run inference on data with multiple timepoints. I am currently testing the formula "condition + timepoint", i.e. using timepoint as an additional covariate, but I am trying to understand whether it would be more correct to run a separate model for each timepoint. I used all the data (multiple timepoints) in a single model because I thought it might help and speed up the inference.

Thanks!

johannesostner commented 3 years ago

Hi!

The formula parameter and the resulting design matrix in scCODA work the same way as in a linear model. If you set time as a covariate, scCODA will test for linear effects of time on your cell populations. This might not be flexible enough, though, since time series data is usually autocorrelated and exhibits nonlinear patterns. scCODA currently cannot take this into account.
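To make the "linear effects over time" point concrete, here is a minimal pure-Python sketch of the design matrix a formula like "condition + timepoint" implies. It does not call scCODA: `design_matrix`, the sample dictionaries, and the column names are hypothetical, and the intercept is left out for brevity. The categorical variant is shown only for contrast, not as a recommendation.

```python
def design_matrix(samples, time_as_category=False):
    """Toy design matrix for 'condition + timepoint' (hypothetical helper).

    Uses treatment coding: the first sorted level of each categorical
    variable is the reference and gets no column of its own.
    """
    conditions = sorted({s["condition"] for s in samples})
    timepoints = sorted({s["timepoint"] for s in samples})

    columns = [f"condition[{c}]" for c in conditions[1:]]
    if time_as_category:
        # One indicator column per non-reference timepoint.
        columns += [f"timepoint[{t}]" for t in timepoints[1:]]
    else:
        # A single numeric column: one slope shared by all timepoints.
        columns.append("timepoint")

    rows = []
    for s in samples:
        row = [1.0 if s["condition"] == c else 0.0 for c in conditions[1:]]
        if time_as_category:
            row += [1.0 if s["timepoint"] == t else 0.0 for t in timepoints[1:]]
        else:
            row.append(float(s["timepoint"]))
        rows.append(row)
    return columns, rows


# Made-up samples: two conditions measured at days 0, 2 and 7.
samples = [
    {"condition": c, "timepoint": t}
    for c in ("control", "treated")
    for t in (0, 2, 7)
]

linear_cols, linear_rows = design_matrix(samples)
cat_cols, cat_rows = design_matrix(samples, time_as_category=True)

print(linear_cols)  # numeric time: a single 'timepoint' column
print(cat_cols)     # categorical time: one column per non-reference timepoint
```

With numeric time there is only one time coefficient per cell type, so the modelled effect has to change proportionally with the timepoint value. Neither encoding accounts for correlation between measurements at successive timepoints.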

Therefore, I'd use one of two approaches that are possible with the current state of scCODA:

I know that this might be tedious for data with many time points, but I currently cannot recommend any other way to use scCODA with such data.

fbrundu commented 3 years ago

Thanks for the explanation. It makes sense.