theislab / scCODA

A Bayesian model for compositional single-cell data analysis
BSD 3-Clause "New" or "Revised" License

Test All Pairwise Comparisons #26

Closed kanefos closed 3 years ago

kanefos commented 3 years ago

Hi there,

My dataset contains three conditions (A, B, C), but none of them is a control. That is, I am interested in compositional changes between A & B, A & C, and B & C. Is there a way to configure scCODA or the formula parameter so that no single condition is treated as the control (for example, by removing patsy's "treatment coding")?

If not, an alternative I may employ (and would be grateful for feedback on!) is based on this response: running scCODA three times, each time with a different one of my three conditions as the control, and correcting for the +/- sign of the results when combining the credible non-zero effects. Would this work?

Thanks a lot!

johannesostner commented 3 years ago

Hi! If you have three conditions and want to look at changes between all of them, I can think of two options, depending on the question you are asking. Assuming that your conditions right now are a column in adata.obs that is something like "A,A,B,B,C,A,B,...":

If you want to know what changes between conditions (i.e. what's the difference between treatment A and treatment B), then you can remove all samples of one condition from the data and compare the other two, then repeat this 3 times (A & B, A & C, and B & C), as you described. I'm not quite sure what you mean by your last sentence, though. The discussion you linked was about selecting a reference cell type (that is assumed to stay constant under the condition(s)), not about conditions.
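As a rough illustration of the "drop one condition, compare the other two" idea, here is a minimal sketch in plain Python. The label list and the helper name `pairwise_sample_indices` are made up for the example and stand in for the condition column in adata.obs; nothing here is scCODA API.

```python
from itertools import combinations

# Hypothetical per-sample condition labels (stand-in for a column in adata.obs).
conditions = ["A", "A", "B", "B", "C", "A", "B", "C"]


def pairwise_sample_indices(labels):
    """Map each unordered pair of conditions to the indices of samples in either one."""
    levels = sorted(set(labels))
    return {
        (first, second): [i for i, lab in enumerate(labels) if lab in (first, second)]
        for first, second in combinations(levels, 2)
    }


pairs = pairwise_sample_indices(conditions)
for (first, second), idx in pairs.items():
    # Each index list is the subset of samples to keep for one two-condition run.
    print(f"{first} vs {second}: keep samples {idx}")
```

With an AnnData object, the same selection would typically be a boolean mask on the obs column, e.g. `adata[adata.obs["condition"].isin(["A", "B"])]`, where the column name `"condition"` is an assumption. Each subset would then go through a separate two-condition model run.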

If you want to know what difference the presence of a condition makes (i.e. treatment A vs. not treatment A), you can just make new (binary) columns in adata.obs that represent this (is_A = (1,1,0,0,0,1,0,...)) for all conditions and use these in the formula, one at a time. You might take these results with a grain of salt, though, since your samples with is_A = 0 have either condition B or C. If something happens in B and C, but not in A, you will get an effect on A. As long as you don't have samples where none of the conditions is present, there's no way to distinguish these interpretations.
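The binary columns could be built along these lines. This is only a sketch: the label list and the helper name `one_vs_rest_indicators` are hypothetical, and the result is a plain dict of lists rather than actual adata.obs columns.

```python
# Hypothetical per-sample condition labels (stand-in for a column in adata.obs).
conditions = ["A", "A", "B", "B", "C", "A", "B"]


def one_vs_rest_indicators(labels):
    """Build one 0/1 indicator list per condition: 1 where the sample has that condition."""
    return {
        f"is_{level}": [int(lab == level) for lab in labels]
        for level in sorted(set(labels))
    }


indicators = one_vs_rest_indicators(conditions)
for name, values in indicators.items():
    print(name, values)
```

Each indicator would be stored as its own column in adata.obs and used as the formula covariate in a separate run. Note that every sample has exactly one indicator set to 1, which is why "effect of A" and "effect of B and C together" cannot be told apart here.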

kanefos commented 3 years ago

Ok perfect, thank you. That makes perfect sense, I will simply remove a condition and run the three contrasts as you suggested. Your second suggestion was also informative, thank you.

mariecrane commented 1 year ago

I am working on a similar analysis of pairwise comparisons between 4 groups and would like to know what your recommendation is for correcting for multiple comparisons? I know the inclusion probability is not the same as a p-value but would you recommend making some adjustment (maybe decreasing the FDR threshold?) to account for the fact that I am making multiple comparisons?

johannesostner commented 1 year ago

Accounting for multiple testing is equivalent to controlling the FDR at your preferred level. Therefore, you can just set the FDR threshold as outlined in our tutorials.
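To make the link between inclusion probabilities and an FDR level more concrete, here is a generic sketch of how an expected FDR can be bounded from posterior inclusion probabilities. It is not scCODA's code, the function name `select_by_expected_fdr` and the probabilities are invented for illustration, and in scCODA itself the threshold is set on the results object as described in the tutorials.

```python
def select_by_expected_fdr(inclusion_probs, target_fdr):
    """Return indices of the largest top-ranked set of effects whose
    mean (1 - inclusion probability) does not exceed target_fdr."""
    # Rank effects from most to least credible.
    order = sorted(
        range(len(inclusion_probs)),
        key=lambda i: inclusion_probs[i],
        reverse=True,
    )
    selected = []
    error_sum = 0.0
    for rank, i in enumerate(order, start=1):
        # 1 - inclusion probability is the posterior chance this effect is a false discovery.
        error_sum += 1.0 - inclusion_probs[i]
        # The running mean never decreases, so stop at the first violation.
        if error_sum / rank > target_fdr:
            break
        selected.append(i)
    return selected


# Hypothetical inclusion probabilities for five candidate effects.
probs = [0.99, 0.95, 0.80, 0.60, 0.30]
print(select_by_expected_fdr(probs, target_fdr=0.10))
print(select_by_expected_fdr(probs, target_fdr=0.05))
```

A stricter target keeps fewer effects, so lowering the FDR level is the knob for being more conservative across several comparisons.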