Pseudo-bulk or mixed-effects modeling approach to differential testing of reaction scores

Thank you for developing this great tool! I am wondering whether you have considered other approaches to testing for differential reaction scores between conditions or cell types of interest. Your currently suggested approach is a Wilcoxon rank-sum test, but it does not account for the pseudo-replication that arises when many cells come from a single biological sample. When there are multiple samples per experimental condition, cells from the same sample are not statistically independent, so treating each cell as an independent observation tends to produce overly small p-values and inflate false positives.

The two common strategies for addressing pseudo-replication in differential expression analysis of single-cell RNA-seq data are pseudo-bulk aggregation and mixed-effects modeling. What do you think of applying these strategies to Compass results? I know Compass can, in principle, run on bulk RNA-seq data, so I have considered aggregating (summing) my counts across cells within each sample and cell type to generate pseudo-bulk profiles and running Compass on those. An added benefit is that this would drastically reduce computing time, since Compass would only have to calculate penalties for a handful of samples instead of thousands of cells. The drawback is that it gives up the granularity provided by single-cell measurements. I have also considered modeling the per-cell reaction scores with a mixed-effects model that includes the donor/sample as a random effect, which is another way to account for pseudo-replication.
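To make the two options concrete, here is a minimal sketch of what I have in mind. It assumes pandas and statsmodels; the toy data, the column names (`sample`, `condition`, `score`), and the gene names are all made up for illustration and are not part of Compass. The first part sums counts per sample to build pseudo-bulk profiles, and the second fits a random-intercept model to a single simulated reaction score.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 samples (3 per condition), 50 cells each, 4 genes.
n_samples, cells_per_sample, n_genes = 6, 50, 4
sample_names = [f"s{i}" for i in range(n_samples)]
sample_ids = np.repeat(sample_names, cells_per_sample)
meta = pd.DataFrame({
    "sample": sample_ids,
    "condition": np.where(np.isin(sample_ids, sample_names[:3]), "ctrl", "treated"),
})
counts = pd.DataFrame(
    rng.poisson(2.0, size=(len(meta), n_genes)),
    columns=[f"gene{j}" for j in range(n_genes)],
)

# (1) Pseudo-bulk: sum raw counts over all cells belonging to each sample.
# To aggregate per sample and cell type, group by both labels instead.
pseudobulk = counts.groupby(meta["sample"]).sum()
print(pseudobulk)

# (2) Mixed-effects model on a per-cell reaction score.
# The simulated score has a sample-level shift and no true condition effect.
sample_shift = dict(zip(sample_names, rng.normal(0.0, 1.0, n_samples)))
meta["score"] = meta["sample"].map(sample_shift) + rng.normal(0.0, 0.5, len(meta))

# Fixed effect for condition, random intercept for each sample.
model = smf.mixedlm("score ~ condition", data=meta, groups=meta["sample"])
fit = model.fit()
print(fit.summary())
```

In practice the pseudo-bulk matrix would be the Compass input, and the mixed model would be fit once per reaction with multiple-testing correction across reactions. With only a few samples per condition, I would expect the variance component to be poorly estimated, which is part of why I am asking.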

What do you think of these approaches? I would greatly value your thoughts or concerns about their validity in this context.
