satijalab / sctransform

R package for modeling single cell UMI expression data using regularized negative binomial regression
GNU General Public License v3.0

variable.features.n didn't change the number of genes used #124

Closed Evenlyeven closed 2 years ago

Evenlyeven commented 2 years ago

Hi,

Firstly, thanks for this amazing package!

I just noticed that regardless of the value I set for variable.features.n, the log always reports "Using 2000 genes". Or am I misunderstanding which number of genes is used in the process?

Thank you very much in advance!

c(Er_treated.sct, control.sct) %<-% lapply(
  X = c(Er_treated.sub, control.sub),
  FUN = function(x) {
    SCTransform(x, variable.features.n = 4000)
  }
)

Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 17373 by 5045
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 5000 cells
|===============================================================================================================================| 100%
Found 130 outliers - those will be ignored in fitting/regularization step

Second step: Get residuals using fitted parameters for 17373 genes
|===============================================================================================================================| 100%
Computing corrected count matrix for 17373 genes
|===============================================================================================================================| 100%
Calculating gene attributes
Wall clock passed: Time difference of 2.110177 mins
Determine variable features
Place corrected count matrix in counts slot
Centering data matrix
|===============================================================================================================================| 100%
Set default assay to SCT
Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 19138 by 7567
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 5000 cells
|===============================================================================================================================| 100%
Found 107 outliers - those will be ignored in fitting/regularization step

Second step: Get residuals using fitted parameters for 19138 genes
|===============================================================================================================================| 100%
Computing corrected count matrix for 19138 genes
|===============================================================================================================================| 100%
Calculating gene attributes
Wall clock passed: Time difference of 2.398781 mins
Determine variable features
Place corrected count matrix in counts slot
Centering data matrix
|===============================================================================================================================| 100%
Set default assay to SCT
There were 50 or more warnings (use warnings() to see the first 50)

saketkc commented 2 years ago

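The 2000 you see here is the default number of genes used for estimating the Negative Binomial parameters (you can change this by passing the n_genes parameter to SCTransform). It is separate from variable.features.n, which controls how many variable features are selected after the transformation. The variable features are set to 3000 by default, and you can confirm the number actually selected with length(VariableFeatures(object)).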
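To make the distinction concrete, here is a minimal sketch that sets both parameters in one call and then checks the number of variable features. It assumes Seurat is installed and uses the small pbmc_small example object that ships with it. The object name obj and the values 100 and 50 are chosen only for illustration (the example object has far fewer genes than a real dataset), and n_genes is assumed to be forwarded to sctransform::vst through the extra arguments of SCTransform, as described in the reply above.

```r
library(Seurat)

# pbmc_small is a tiny example Seurat object bundled with Seurat/SeuratObject;
# it stands in for a real dataset here (assumption for illustration only).
obj <- SCTransform(
  pbmc_small,
  n_genes = 100,             # genes subsampled to estimate the NB regression parameters
  variable.features.n = 50,  # number of variable features selected afterwards
  verbose = FALSE
)

# This is the count that variable.features.n controls,
# not the "Using ... genes" line printed during model fitting.
length(VariableFeatures(obj))
```

With verbose = TRUE, the "Using ... genes" line in the log should follow n_genes, while length(VariableFeatures(obj)) should follow variable.features.n.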