Understanding the purpose of corrected counts

danielcgingerich commented 10 months ago

I'm trying to understand what exactly corrected counts are. I do not see how the corrected counts differ from the raw count matrix. I've looked at the source code for correct_counts in this repository and identified the source of my misunderstanding, from line 237 in denoise.R:

mu <- exp(tcrossprod(coefs, regressor_data_orig))
variance <- mu + mu^2 / theta
y <- as.matrix(umi[genes_bin, , drop=FALSE])
pearson_residual <- (y - mu) / sqrt(variance)
# generate output
mu <- exp(tcrossprod(coefs, regressor_data))
variance <- mu + mu^2 / theta
y.res <- mu + pearson_residual * sqrt(variance)

mu + pearson_residual * sqrt(variance) = mu + (y - mu) / sqrt(variance) * sqrt(variance) = mu + y - mu = y.

What's the purpose of this?

saketkc commented 10 months ago

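This is explained in the paper. Briefly, the idea is to put all the cells at the same sequencing depth (the median) and then ask what the observed counts would be, with minimal technical variation (i.e. mostly biological variance), given that the model captures the technical variation. Note that in the snippet you quoted, the second mu and variance are computed from regressor_data (depth fixed at the median), not regressor_data_orig, so they do not cancel against the first pair and y.res is generally not equal to y.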
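To make the difference concrete, here is a minimal sketch for a single gene in three cells. It mirrors the two steps of the quoted snippet, but every number in it (depths, counts, coefs, theta) is made up for illustration, and the names mu_orig, mu_med, var_orig, var_med and y_res are hypothetical labels rather than names from the package. It assumes the only regressor is log10 total UMI with a log link.

```r
# Toy example for ONE gene; all values below are invented for illustration.
log_umi <- log10(c(500, 2000, 8000))        # per-cell depth (log10 total UMI)
y       <- matrix(c(1, 6, 20), nrow = 1)    # observed UMI counts for the gene
coefs   <- matrix(c(log(0.002), log(10)), nrow = 1)  # hypothetical intercept, slope
theta   <- 10                               # hypothetical NB dispersion

# Regressors as observed, and with every cell placed at the median depth.
regressor_data_orig <- cbind(1, log_umi)
regressor_data      <- cbind(1, rep(median(log_umi), length(log_umi)))

# Step 1: Pearson residuals under the model at each cell's own depth.
mu_orig  <- exp(tcrossprod(coefs, regressor_data_orig))
var_orig <- mu_orig + mu_orig^2 / theta
pearson_residual <- (y - mu_orig) / sqrt(var_orig)

# Step 2: map the residuals back to counts at the median depth.
mu_med  <- exp(tcrossprod(coefs, regressor_data))
var_med <- mu_med + mu_med^2 / theta
y_res   <- mu_med + pearson_residual * sqrt(var_med)

print(rbind(observed = as.numeric(y), corrected = as.numeric(y_res)))
```

With these made-up values, only the cell that already sits at the median depth keeps its observed count. The shallow cell is adjusted upward and the deep cell downward, which is the sense in which the corrected counts differ from the raw matrix.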

danielcgingerich commented 5 months ago

@saketkc Thank you!

satijalab / sctransform

Understanding the purpose of corrected counts #179