satijalab / sctransform

R package for modeling single cell UMI expression data using regularized negative binomial regression
GNU General Public License v3.0

scale_factor #128

Closed: z5ouyang closed this issue 2 years ago

z5ouyang commented 2 years ago

Hi, thanks for implementing this tool! I was looking into the options of "vst" and noticed "scale_factor". Is there more detailed documentation than "Replace all values of UMI in the regression model by this value. Default is NA"? Is this a scalar, or a vector for all genes? Why and how should the user consider this option?

Suppose I have two 10X runs/reactions whose sequencing depths differ (one is twice as deep as the other, for instance). Could I use "scale_factor" to SCTransform them separately, or should I merge them and SCTransform them together with "batch_var"?

Thanks in advance.

saketkc commented 2 years ago

Hi @z5ouyang,

> Is this a scalar, or a vector for all genes? Why/how should the user consider this option?

Yes, it is a scalar. Leaving it as NA means the median sequencing depth (median(nCount_RNA)) is used in its place. It is only used during the reverse regression step that generates the corrected counts, and it does not affect the calculation of Pearson residuals, even if you have multiple datasets.
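To make the role of scale_factor concrete, here is a minimal Python sketch of the reverse regression idea for a single gene. It is not sctransform's code: the function names, the parameter values (beta0, beta1, theta), and the toy depths are all assumptions for illustration. It assumes a per-gene model log(mu) = beta0 + beta1 * log10(depth) with negative binomial variance mu + mu^2 / theta.

```python
import math
import statistics


def nb_mean(beta0, beta1, depth):
    """Expected count under the assumed model log(mu) = beta0 + beta1 * log10(depth)."""
    return math.exp(beta0 + beta1 * math.log10(depth))


def nb_sd(mu, theta):
    """Negative binomial standard deviation: sqrt(mu + mu^2 / theta)."""
    return math.sqrt(mu + mu * mu / theta)


def pearson_residual(count, depth, beta0, beta1, theta):
    """Residual computed from the cell's own depth; scale_factor plays no part here."""
    mu = nb_mean(beta0, beta1, depth)
    return (count - mu) / nb_sd(mu, theta)


def corrected_count(count, depth, all_depths, beta0, beta1, theta, scale_factor=None):
    """Reverse the regression at a common depth to get a corrected count.

    scale_factor=None stands in for NA: the median depth of all cells is used.
    """
    target = statistics.median(all_depths) if scale_factor is None else scale_factor
    resid = pearson_residual(count, depth, beta0, beta1, theta)
    mu_new = nb_mean(beta0, beta1, target)
    value = mu_new + resid * nb_sd(mu_new, theta)
    # Round to an integer and clip negatives to zero
    return max(0, round(value))


# Hypothetical toy data: three cells, one gene
depths = [1000, 2000, 4000]
counts = [1, 3, 5]
beta0, beta1, theta = -6.9, 2.3, 10.0  # made-up fitted parameters

for c, d in zip(counts, depths):
    print(d, pearson_residual(c, d, beta0, beta1, theta),
          corrected_count(c, d, depths, beta0, beta1, theta))
```

In this sketch the residual is computed before scale_factor is ever consulted, which mirrors the point above: changing scale_factor shifts only the corrected counts. A cell whose depth already equals the target depth gets its original count back.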

If you have two sequencing runs with different depths, I would recommend normalizing them separately (to learn a dataset-specific noise model). To perform DE, I would recommend looking at this vignette. batch_var is currently not supported in Seurat::SCTransform(), but you can still use it outside Seurat. Here is a vignette that uses batch_var outside Seurat. You can replace the scale.data slot with vst_out$y from this vignette and continue with the standard Seurat analysis that you would do after invoking SCTransform().

z5ouyang commented 2 years ago

Thank you very much for the explanation!