HelenaLC / muscat

Multi-sample multi-group scRNA-seq analysis tools

mmDS using 'nbinom' #45

Closed yuyingxie closed 4 years ago

yuyingxie commented 4 years ago

I notice that in the code

    if (is.null(sizeFactors(x))) {
        cd$ls <- log(colSums(y))
    } else {
        cd$ls <- sizeFactors(x)
    }

If we have already calculated the size factors, why do we need to calculate them again? This creates an issue: if we only want to test a few of the genes, the size factors are better calculated from all the genes than from that handful of genes.
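To illustrate the concern, here is a minimal sketch in plain Python (not R, and not muscat's code). It assumes that `y` holds only the tested genes when the fallback runs, which the quoted snippet does not confirm. The toy matrix and the helper `library_sizes` are hypothetical.

```python
def library_sizes(count_matrix, gene_indices=None):
    """Column sums of a genes x cells matrix, optionally restricted to some genes."""
    if gene_indices is None:
        rows = count_matrix
    else:
        rows = [count_matrix[i] for i in gene_indices]
    return [sum(column) for column in zip(*rows)]


# Hypothetical counts: 4 genes (rows) x 3 cells (columns).
counts = [
    [10, 20, 30],
    [5, 10, 15],
    [0, 8, 1],
    [2, 0, 9],
]

all_gene_sizes = library_sizes(counts)              # uses every gene
subset_sizes = library_sizes(counts, [2, 3])        # uses only the two tested genes

# Size of cell 2 relative to cell 1 under each calculation.
ratio_all = all_gene_sizes[1] / all_gene_sizes[0]
ratio_subset = subset_sizes[1] / subset_sizes[0]
print(all_gene_sizes, subset_sizes, ratio_all, ratio_subset)
```

In this toy case the relative sizes of the cells differ depending on which genes enter the sum, so an offset computed from a small subset could differ noticeably from one computed on the full data.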

yuyingxie commented 4 years ago

I mean we calculate 'sizeFactors' using computeSumFactors(B, clusters=clusters)

but here, it uses 'log(colSums(y))'.

Which one should be used?

plger commented 4 years ago

As the code you quote clearly indicates, log(colSums(y)) is used only if is.null(sizeFactors(x)), in other words, only if you haven't provided pre-calculated size factors.

yuyingxie commented 4 years ago

My question is about the difference between the two ways of calculating them.

plger commented 4 years ago

computeSumFactors uses the pooling method, which has been argued to be more robust, and it scales the factors so that they average to 1 (which shouldn't make a difference for their use as an offset in the model fitting). Otherwise, the results should be roughly the same.
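The point about scaling can be checked with a small sketch in plain Python. It assumes a Poisson log-linear model with one rate per group and log(size factor) as offset, which is a simplification of the 'nbinom' model discussed here. For that simplified model the maximum-likelihood rate has a closed form, sum of counts over sum of size factors. The data and the helper `poisson_group_log_rates` are hypothetical.

```python
import math


def poisson_group_log_rates(counts, size_factors, groups):
    """Closed-form MLE of log rates for a Poisson model with a per-group rate
    and log(size_factors) as offset: rate_g = sum(y_g) / sum(s_g)."""
    totals = {}
    for y, s, g in zip(counts, size_factors, groups):
        y_sum, s_sum = totals.get(g, (0.0, 0.0))
        totals[g] = (y_sum + y, s_sum + s)
    return {g: math.log(y_sum / s_sum) for g, (y_sum, s_sum) in totals.items()}


# Hypothetical toy data: one gene, six cells, two groups.
counts = [3, 5, 4, 10, 12, 9]
groups = ["A", "A", "A", "B", "B", "B"]
raw_factors = [900.0, 1500.0, 1200.0, 1000.0, 1400.0, 1100.0]

# Rescale the same factors so that they average to 1.
mean_factor = sum(raw_factors) / len(raw_factors)
unit_mean_factors = [s / mean_factor for s in raw_factors]

raw_fit = poisson_group_log_rates(counts, raw_factors, groups)
scaled_fit = poisson_group_log_rates(counts, unit_mean_factors, groups)

# The group contrast (log fold change) is unaffected by the rescaling ...
lfc_raw = raw_fit["B"] - raw_fit["A"]
lfc_scaled = scaled_fit["B"] - scaled_fit["A"]

# ... while the baseline log rate shifts by the constant log(mean_factor).
intercept_shift = scaled_fit["A"] - raw_fit["A"]
print(lfc_raw, lfc_scaled, intercept_shift, math.log(mean_factor))
```

Multiplying all size factors by a constant adds a constant to the log offset, and the intercept absorbs it. The same argument holds for any log-link model with an intercept, so only differences in the relative factors between cells (pooling versus plain column sums) can change the test results.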