theislab / scCODA

A Bayesian model for compositional single-cell data analysis
BSD 3-Clause "New" or "Revised" License

How to compare multiple groups with sccoda #94

Closed: karenlawwc closed this 1 month ago

karenlawwc commented 4 months ago

Hi, I have three groups whose cell composition I want to compare using scCODA. What would be the best way to do this? Using a different group as the reference gives different differential abundance results.

What best practice would you recommend? Subsetting the data and comparing only two groups at a time?

Thanks!

johannesostner commented 4 months ago

Hi @karenlawwc!

I'm not 100% sure what the term "reference" in your question refers to. Do you mean the categories in your composition (cell types) or the levels of a covariate variable (e.g. control - condition 1 - condition 2)?

karenlawwc commented 4 months ago

Sorry for not being clear. I have three levels of a covariate, but none of them is a control. I am wondering whether it is best to compare Treatment 1 vs. Treatment 2, Treatment 2 vs. Treatment 3, and finally Treatment 1 vs. Treatment 3. However, if I do it that way, some cell types are significant at FDR = 0.05 when I compare Treatment 1 vs. 2 but are no longer significant when I compare Treatment 2 vs. 1. So I just want to know what the best practice is when comparing multiple levels of a covariate. Thanks!

johannesostner commented 4 months ago

Yes, I'd do a pairwise comparison of all three levels.

Changes in compositional data are not symmetric. Say an increase by x units from condition 1 to condition 2 corresponds to an increase by y%. The other way around, a decrease by x units from condition 2 to condition 1 amounts to a different relative change, because the baseline is different. Therefore, some effects might be detected in one direction but not in the other. If you don't have a clear baseline category, you can either just decide on one direction, or run the model in both directions and keep only the common effects for a more conservative estimate.
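To make the asymmetry concrete, here is a small sketch with made-up counts (not taken from any real dataset). The helper names `relative_change` and `common_effects` are hypothetical and not part of scCODA; the second one only illustrates the "keep the effects found in both directions" idea on plain Python sets, assuming you have already collected the credible cell types from each run.

```python
def relative_change(start, end):
    """Relative change from start to end, as a fraction of start."""
    return (end - start) / start


def common_effects(forward, backward):
    """Cell types flagged as credible in both directions (conservative choice)."""
    return set(forward) & set(backward)


# Toy counts for one cell type in two conditions (illustrative only).
count_cond1, count_cond2 = 100, 150

up = relative_change(count_cond1, count_cond2)    # +50 cells relative to 100
down = relative_change(count_cond2, count_cond1)  # -50 cells relative to 150
print(f"cond1 -> cond2: {up:+.1%}")
print(f"cond2 -> cond1: {down:+.1%}")

# Hypothetical outputs of two model runs with swapped reference levels.
credible_1_vs_2 = {"B cells", "T cells"}
credible_2_vs_1 = {"T cells"}
print(common_effects(credible_1_vs_2, credible_2_vs_1))
```

The same absolute shift of 50 cells is a 50% increase in one direction and roughly a 33% decrease in the other, which is why a borderline effect can show up in only one of the two runs.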

wdg118 commented 1 month ago

Hi @johannesostner,

Which metric do we use for pairwise comparisons? The Final Parameter?

Thanks!

johannesostner commented 1 month ago

Hi @wdg118! Each pairwise comparison works just like a normal scCODA model (Final Parameter != 0 --> credible effect). When comparing scCODA runs on different pairs of conditions, the final parameter only gives you a qualitative comparison; its numerical values are not easily comparable. Instead, you can look at the log-fold change in the summary, which is comparable quantitatively as well.
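As a rough illustration of what a log-fold change measures, here is a minimal sketch on toy expected counts. The numbers are made up, `log2_fold_change` is a hypothetical helper, and how scCODA derives the expected counts behind its summary is not shown here.

```python
import math


def log2_fold_change(baseline, treated):
    """log2 of the ratio treated / baseline."""
    return math.log2(treated / baseline)


# Toy expected counts for one cell type (illustrative only).
baseline, treated = 100.0, 150.0

forward = log2_fold_change(baseline, treated)
backward = log2_fold_change(treated, baseline)
print(f"forward:  {forward:+.3f}")
print(f"backward: {backward:+.3f}")
```

Because it is a log ratio, swapping the two conditions only flips the sign, so the magnitude can be read on the same scale across different pairs of conditions.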

wdg118 commented 1 month ago

Hi @johannesostner,

Thanks for clarifying this. I appreciate it! Also, I'm assuming the model accounts and corrects for multiple comparisons? I believe this can be inferred from the fact that you can change the FDR, right?

Thanks!

johannesostner commented 1 month ago

Exactly! You can set the FDR to a desired level, and the model will select the credible effects based on their Bayesian inclusion probability, controlling for multiple comparisons. You can read more about that in the paper if you want.
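For intuition, one common way to turn inclusion probabilities into an FDR-controlled selection is to keep the most probable effects for as long as the average of (1 - inclusion probability) over the selected set stays at or below the target. The sketch below implements that generic idea; treating it as what scCODA does internally is an assumption, and both the function name `select_by_expected_fdr` and the probabilities are hypothetical.

```python
def select_by_expected_fdr(inclusion_probs, fdr=0.05):
    """Return the names selected so that the expected FDR stays at or below `fdr`.

    inclusion_probs: dict mapping effect name -> posterior inclusion probability.
    The expected FDR of a selected set is the mean of (1 - probability) over it.
    """
    ranked = sorted(inclusion_probs.items(), key=lambda kv: kv[1], reverse=True)
    selected, error_sum = [], 0.0
    for name, prob in ranked:
        # Expected FDR if this effect were added to the selection.
        candidate_fdr = (error_sum + (1.0 - prob)) / (len(selected) + 1)
        if candidate_fdr > fdr:
            break
        error_sum += 1.0 - prob
        selected.append(name)
    return selected


# Toy inclusion probabilities (illustrative only).
toy_probs = {"T cells": 0.99, "B cells": 0.95, "NK cells": 0.80, "Monocytes": 0.30}
print(select_by_expected_fdr(toy_probs, fdr=0.05))
print(select_by_expected_fdr(toy_probs, fdr=0.10))
```

Loosening the target from 0.05 to 0.10 admits one more effect in this toy example, which mirrors how changing the FDR level changes the set of credible effects.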