Possible Issue with power calculation

pezantn commented 1 year ago

When using the power.detect.celltype function, increasing nSamples leads to lower power. For example, power.detect.celltype(nCells = 3600, min.num.cells = 60, cell.type.frac = seq(0.01, 0.05, 0.001), nSamples = 20) gives higher power at every cell type frequency than power.detect.celltype(nCells = 3600, min.num.cells = 60, cell.type.frac = seq(0.01, 0.05, 0.001), nSamples = 40). This seems counterintuitive, as power should increase with sample size. Thanks!

KatharinaSchmid commented 1 year ago

Hi,

thanks for your question and sorry for the late answer (holidays). It depends on how you define "power". In our definition, power is the probability of detecting at least min.num.cells cells of the cell type of interest in every sample. A higher number of samples therefore means that this threshold has to be reached in more individuals. Our rationale was that for a cell type specific eQTL or DE analysis, you need to find the cells of interest in each sample. Mathematically, we calculate the overall probability as the probability of detecting the cell type in one individual, raised to the power of the sample size. So it is correct that the power decreases with nSamples under this definition.
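To make the "raised to the power of the sample size" step concrete, here is a minimal sketch. It assumes the per-sample count of the cell type is binomial and that samples are independent; detect_prob_sketch is a hypothetical helper written for illustration, not the scPower implementation.

```r
# Sketch only: binomial stand-in for the per-sample detection probability.
# detect_prob_sketch is a hypothetical name, not a function from scPower.
detect_prob_sketch <- function(nCells, min.num.cells, cell.type.frac, nSamples) {
  # P(at least min.num.cells cells of the type among nCells) in one sample
  p_single <- pbinom(min.num.cells - 1, size = nCells, prob = cell.type.frac,
                     lower.tail = FALSE)
  # Every sample has to reach the threshold, so the probabilities multiply
  p_single^nSamples
}

frac <- seq(0.01, 0.05, 0.001)
p20 <- detect_prob_sketch(3600, 60, frac, nSamples = 20)
p40 <- detect_prob_sketch(3600, 60, frac, nSamples = 40)

# A probability in [0, 1] raised to a larger exponent can only shrink
print(all(p40 <= p20))
```

Under these assumptions the comparison is TRUE for every cell type frequency, which matches the behaviour reported above.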

In contrast, if you only want to detect a certain number of cells overall, and not in each sample, the power should of course increase the more cells you have. You could mimic this in our function by running: power.detect.celltype(nCells * nSamples, min.num.cells, cell.type.frac, nSamples = 1)

For example, with 3600 cells per sample, a cell type frequency of 0.001, and 20 versus 60 samples:

> power.detect.celltype(3600 * 20, 60,0.001, nSamples=1)
[1] 0.9331444
> power.detect.celltype(3600 * 60, 60,0.001, nSamples=1)
[1] 1
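
The same pooling idea can be sketched without the package. This assumes a binomial count over all pooled cells; pooled_detect_sketch is a hypothetical helper, so its values need not match the scPower output exactly.

```r
# Sketch only: detection probability when all samples are pooled.
# pooled_detect_sketch is a hypothetical name, not a function from scPower.
pooled_detect_sketch <- function(nCellsPerSample, nSamples, min.num.cells,
                                 cell.type.frac) {
  # Treat all cells as one pool and ask for min.num.cells cells overall
  pbinom(min.num.cells - 1, size = nCellsPerSample * nSamples,
         prob = cell.type.frac, lower.tail = FALSE)
}

n_samples <- c(5, 10, 20, 40, 60)
pooled <- sapply(n_samples,
                 function(n) pooled_detect_sketch(3600, n, 60, 0.001))

# With pooling, more samples means more cells, so the probability rises
print(data.frame(nSamples = n_samples, power = pooled))
```

Here the probability does not decrease as samples are added, because the threshold applies to the pooled total and not to each individual.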

I hope this explanation was helpful. Let me know in case you have further questions.

Regards, Katharina

Cem-Gulec commented 1 year ago

@KatharinaSchmid should we close this issue too?

heiniglab / scPower

Possible Issue with power calculation #24