theislab / scCODA

A Bayesian model for compositional single-cell data analysis
BSD 3-Clause "New" or "Revised" License

Bootstrap or splitting my samples #74

Closed dekan-aleksandr closed 1 year ago

dekan-aleksandr commented 1 year ago

Hi,

I have 6 samples from 6 different time points, i.e. 1 replicate per group. Would it be reasonable to bootstrap or split each sample to get more replicates?

johannesostner commented 1 year ago

Hi @dekan-aleksandr,

I'd advise against it. Splitting each sample can easily produce subsamples with very different compositions, which may badly skew the results. As for bootstrapping, it won't give you more information than what you already have in your one sample per group, so it does not solve your problem. Also, I don't see how you would compare the results from different bootstrap samples. The "Final Parameter" values that scCODA reports for each cell type are not necessarily comparable between runs on different data, since they are relative estimates that depend on the composition of all cell types. Besides that, the computational cost of a good bootstrap (10,000 replicates) would be massive.

In scCODA, the Bayesian framework actually makes it possible to obtain results with only one sample per group, although the changes must be very extreme to become credible (see the simulated data benchmark in our paper).
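
To illustrate the bootstrapping point, here is a minimal sketch in plain NumPy. It is not scCODA code: the cell counts and the function name `bootstrap_compositions` are made up for illustration. It assumes that bootstrapping means resampling cells with replacement within a single sample, which is equivalent to a multinomial draw using the observed proportions.

```python
import numpy as np

def bootstrap_compositions(counts, n_boot=1000, seed=0):
    """Resample cells with replacement from ONE sample's cell-type counts.

    Hypothetical helper, not part of scCODA. Drawing n_cells cells with
    replacement from the observed cells is the same as a multinomial draw
    with the observed proportions as probabilities.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = int(counts.sum())
    rng = np.random.default_rng(seed)
    # Shape (n_boot, n_cell_types): one row of resampled counts per replicate.
    boot_counts = rng.multinomial(n_cells, counts / counts.sum(), size=n_boot)
    return boot_counts / n_cells

# Made-up counts for 4 cell types in a single sample.
observed = np.array([500, 300, 150, 50])
props = bootstrap_compositions(observed)

print("observed proportions:", observed / observed.sum())
print("bootstrap mean:      ", props.mean(axis=0))
print("bootstrap std:       ", props.std(axis=0))
```

Every bootstrap replicate scatters around the same observed composition, so the spread only reflects cell-sampling noise within that one sample. It says nothing about how much the composition would vary between independent biological replicates, which is the information that is actually missing.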

dekan-aleksandr commented 1 year ago

Thank you @johannesostner! It solves my problem.

I also want to mention that I did run a bootstrap (10k replicates in 6 groups), and it took only 30 seconds to complete.