const-ae / glmGamPoi

Fit Gamma-Poisson Generalized Linear Models Reliably

regarding the "note of caution" #24

Closed by malcook 3 years ago

malcook commented 3 years ago

Regarding your remark:

> A note of caution: applying test_de() to single cell data without the pseudobulk gives overly optimistic p-values. This is due to the fact that cells from the same sample are not independent replicates! It can still be fine to use the method for identifying marker genes, as long as one is aware of the difficulties interpreting the results.

... do you have any guidance as to approaches toward accounting for the overly optimistic p-values?

Naively, I suppose that without knowing how the cell-to-cell variability of expression depends upon the gene, the experimental condition, and possibly even their interaction, there is really no hope for such an accounting. Does this agree with your understanding, or is there more that can be accomplished with the data at hand?

const-ae commented 3 years ago

> Naively, I suppose that without knowing how the cell-to-cell variability of expression depends upon the gene, the experimental condition, and possibly even their interaction, there is really no hope for such an accounting. Does this agree with your understanding, or is there more that can be accomplished with the data at hand?

Yes, I agree. Without measuring how much expression varies between individuals, it will always be difficult to predict how likely an observation is to replicate in the next experiment (which is roughly what the FDR measures).
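
To make the concern about non-independent cells concrete, here is a rough simulation sketch. It is not glmGamPoi code and does not call test_de(). It is a toy model under stated assumptions: Gaussian values stand in for log-expression instead of Gamma-Poisson counts, each condition has 3 individuals with 100 cells each, and there is no true difference between the conditions. All names and parameters (`simulate_gene`, `sd_between`, `sd_within`, and so on) are made up for illustration. The sketch compares a test that treats every cell as an independent replicate with a test on one aggregated value per individual (a stand-in for the pseudobulk; it averages values, whereas a real pseudobulk would typically sum counts).

```python
import random

Z_CRIT = 1.96        # two-sided 5% cutoff, normal approximation (hundreds of cells)
T_CRIT_DF4 = 2.776   # two-sided 5% cutoff of Student's t with 4 degrees of freedom


def simulate_gene(rng, n_samples=3, n_cells=100, sd_between=1.0, sd_within=1.0):
    """Simulate one null gene: two conditions, no true condition effect.

    Each individual (sample) gets its own random offset, so cells from the
    same individual are correlated. Returns [condition_a, condition_b], each
    a list of per-sample lists of cell values.
    """
    conditions = []
    for _ in range(2):
        samples = []
        for _ in range(n_samples):
            offset = rng.gauss(0.0, sd_between)  # individual-level variation
            samples.append([offset + rng.gauss(0.0, sd_within) for _ in range(n_cells)])
        conditions.append(samples)
    return conditions


def mean_var(xs):
    """Sample mean and unbiased sample variance."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, v


def two_sample_stat(x, y):
    """Two-sample statistic (mean difference over its standard error).

    With equal group sizes this equals the pooled-variance t statistic.
    """
    mx, vx = mean_var(x)
    my, vy = mean_var(y)
    return (mx - my) / (vx / len(x) + vy / len(y)) ** 0.5


def false_positive_rates(n_genes=1000, seed=0):
    """Fraction of null genes called significant at the 5% level by each approach."""
    rng = random.Random(seed)
    naive_hits = 0
    pseudobulk_hits = 0
    for _ in range(n_genes):
        cond_a, cond_b = simulate_gene(rng)

        # Naive: pool all cells and treat each one as an independent replicate.
        cells_a = [v for sample in cond_a for v in sample]
        cells_b = [v for sample in cond_b for v in sample]
        if abs(two_sample_stat(cells_a, cells_b)) > Z_CRIT:
            naive_hits += 1

        # Pseudobulk-style: one aggregated value per individual, 3 vs 3.
        means_a = [sum(sample) / len(sample) for sample in cond_a]
        means_b = [sum(sample) / len(sample) for sample in cond_b]
        if abs(two_sample_stat(means_a, means_b)) > T_CRIT_DF4:
            pseudobulk_hits += 1

    return naive_hits / n_genes, pseudobulk_hits / n_genes


naive_fpr, pseudobulk_fpr = false_positive_rates()
print(f"cell-level false positive rate: {naive_fpr:.3f}")
print(f"pseudobulk false positive rate: {pseudobulk_fpr:.3f}")
```

In this toy setup the cell-level test should call far more than 5% of the null genes significant, while the per-individual test should stay close to the nominal level. The catch is the one discussed above: the per-individual test only works because there are several individuals per condition from which to estimate the between-individual variability. With a single individual per condition, that variability cannot be estimated from the data at hand.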