satijalab / sctransform

R package for modeling single cell UMI expression data using regularized negative binomial regression
GNU General Public License v3.0

SCTransform: Error with merged datasets #103

Closed arshiyaakeel closed 3 years ago

arshiyaakeel commented 3 years ago

Hello Christoph, 

I have merged 20 public single-cell datasets using the merge command in R. The combined data has:

Total cells: 75017
Total genes: 12489
Total patients: 269

I ran SCTransform on it:

allCell <- SCTransform(seuratObject, batch_var = "Patient", verbose = FALSE)

but I am getting the following error:

Error in if (sum(outliers) > 0) { : missing value where TRUE/FALSE needed
Calls: SCTransform -> do.call -> vst -> reg_model_pars

However, I can successfully run SCTransform on a subset of this data (10 merged datasets), which includes:

Total cells: 41693
Total genes: 15047
Total patients: 124

I am using Seurat version 3.1.4 together with sctransform version 0.2.1.

Is this issue related to the size of the input data matrix? Could you offer any comments or suggestions?

Many thanks in advance. Best Arsh. 

ChristophH commented 3 years ago

Hi Arsh,

this might be a problem when the model parameters are regularized. Hard to say without a reproducible example. If you can, try running the sctransform::vst function directly:

vst_out <- vst(umi = seuratObject$RNA@counts, cell_attr = seuratObject@meta.data, batch_var = "Patient", residual_type = 'none', do_regularize = FALSE)

If that command works, and you share the resulting object with me, I will have a closer look.

I have never run sctransform on that many batches. Does the normalization work if you do not specify a batch_var?

arshiyaakeel commented 3 years ago

Hello Christoph,  Thanks for your prompt reply. I will now try running the sctransform::vst function directly and will share the resulting object with you. 

"Does the normalization work if you do not specify a batch_var?"

Yes, normalization works without a batch variable.

Thank you. Best Arsh

arshiyaakeel commented 3 years ago

Hello Christoph,

The sctransform::vst function finished on 12489 genes and 74801 cells.

You can download the resulting object and a description of the run from the link below. https://we.tl/t-esBoC5JZp

Looking forward to hearing from you.

Many thanks in advance. Best Arsh.

ChristophH commented 3 years ago

Could you also provide the meta data, i.e. seuratObject@meta.data ?

arshiyaakeel commented 3 years ago

Hello Christoph,

You can download the Seurat Object:

https://we.tl/t-9rvgsHGM6

Thank you.

ChristophH commented 3 years ago

sctransform fails because you have patients in your data set with very few cells. Specifically the two patients with just one(!) cell are causing problems because no models can be estimated for them.

You are using Patient as batch variable, but did you check that it is really necessary to do so? You end up with many batches with few cells, e.g. more than half your batches have fewer than 200 cells. Perhaps you could do without a batch indicator, or maybe Dataset as a batch indicator is sufficient.
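A quick way to check how many cells each batch level contributes is to tabulate the batch column of the metadata. The following is only a minimal sketch on toy data: the data frame `md` stands in for seuratObject@meta.data, and the patient labels, cell counts, and the threshold `min_cells` are made up for illustration.

```r
# Toy metadata standing in for seuratObject@meta.data (hypothetical values).
md <- data.frame(
  Patient = c(rep("P1", 250), rep("P2", 120), rep("P3", 1), rep("P4", 40)),
  stringsAsFactors = FALSE
)

# Number of cells per batch level, smallest first.
cells_per_patient <- sort(table(md$Patient))
print(cells_per_patient)

# Batch levels below an arbitrary, illustrative threshold.
min_cells <- 100
small_batches <- names(cells_per_patient)[cells_per_patient < min_cells]
print(small_batches)

# Fraction of batch levels with fewer than 200 cells.
frac_below_200 <- mean(cells_per_patient < 200)
print(frac_below_200)
```

On real data, replacing `md` with the object's own metadata would show directly whether any patient has only a handful of cells.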

arshiyaakeel commented 3 years ago

Hello Christoph, Thanks for the explanation. Yes, we need to use Patient as the batch variable for our analysis. We can apply a cutoff on the number of cells per patient, e.g. a minimum of 50 cells for a patient to be included in the analysis. Do you think that would work? Thank you.

arshiyaakeel commented 3 years ago

Hello Christoph, I could successfully run the normalization after applying a cutoff on the number of cells per patient, e.g. a minimum of 100 cells for a patient to be included in the analysis.

Recently I updated R to use Seurat version 4.0, but got the following error:

> allCell <- SCTransform(seuratObject, batch_var = "Patient", verbose = TRUE)
Warning: The 'show_progress' argument is deprecated as of v0.3. Use 'verbosity' instead. (in sctransform::vst)
Calculating cell attributes from input UMI matrix: log_umi
Error in is.nan(rel_attr) : default method not implemented for type 'list'

I went back to R 3.6.3 and Seurat 3.1.4 but am still getting the same error.

Do you think it is due to a version mismatch between packages?

Thank you very much. Best Arsh.

ChristophH commented 3 years ago

Hi Arsh,

I cannot reproduce the problem when calling sctransform::vst directly with the data that you shared. E.g.

tab_patient <- table(seuratObject@meta.data$Patient)
sel_patient <- names(tab_patient)[tab_patient >= 100]
sel_cells <- seuratObject@meta.data$Patient %in% sel_patient
counts <- seuratObject$RNA@counts[, sel_cells]
md <- seuratObject@meta.data[sel_cells, ]

my_vst_out <- sctransform::vst(counts, cell_attr = md, batch_var = 'Patient')

does not throw this error.

You should use the latest version of Seurat (4.0.2) and sctransform (develop branch) and try again. If you still see this error, please share your seuratObject@meta.data.

arshiyaakeel commented 3 years ago

Hello Christoph,

I am now using Seurat version 4.0.2 and sctransform 0.3.2 with R 4.1.0, but I am getting the same error again.

> my_vst_out <- sctransform::vst(counts, cell_attr = md, batch_var = 'Patient')
Calculating cell attributes from input UMI matrix: log_umi
Error in is.nan(rel_attr) : default method not implemented for type 'list'

Please download seuratObject@meta.data via the link below https://we.tl/t-BgoRnLpjr6

PS: This metadata object contains all patients (before the cutoff).

Thank you. Best Arsh

arshiyaakeel commented 3 years ago

Hello Christoph, Here you can download the Seurat object that throws the same error after updating R: https://we.tl/t-oDDW372NJ8

load("seuratObjectCD4_10X.Rdata")
tab_patient <- table(seuratObject@meta.data$Patient)
sel_patient <- names(tab_patient)[tab_patient >= 100]
sel_cells <- seuratObject@meta.data$Patient %in% sel_patient
counts <- seuratObject$RNA@counts[, sel_cells]
md <- seuratObject@meta.data[sel_cells, ]

my_vst_out <- sctransform::vst(counts, cell_attr = md, batch_var = 'Patient')

Calculating cell attributes from input UMI matrix: log_umi
Error in is.nan(rel_attr) : default method not implemented for type 'list'

Thank you in advance for your help. Best Arsh

ChristophH commented 3 years ago

Again, I cannot reproduce the error. I see:

Variance stabilizing transformation of count matrix of size 12479 by 72037
Model formula is y ~ (log_umi) : Patient + Patient + 0
Get Negative Binomial regression parameters per gene
Using 2000 genes, 72037 cells

followed by model fitting.
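The model formula in this output gives every patient its own intercept and its own log_umi slope. The sketch below, on made-up toy data, illustrates the general linear-algebra reason why a patient with a single cell cannot support both parameters: the design matrix loses rank. It is an illustration only and not a trace of what sctransform does internally; the data frame `cell_attr` and its values are hypothetical.

```r
# Toy cell attributes (hypothetical): patient "C" contributes a single cell.
cell_attr <- data.frame(
  Patient = factor(c(rep("A", 5), rep("B", 5), "C")),
  log_umi = log10(c(1200, 1500, 900, 2000, 1100,
                    1300, 1700, 800, 2500, 1000,
                    1400))
)

# Same right-hand side as the model formula printed above:
# one intercept and one log_umi slope per patient, no global intercept.
X <- model.matrix(~ log_umi:Patient + Patient + 0, data = cell_attr)

# Number of parameters requested versus number that can be identified.
n_params <- ncol(X)
design_rank <- qr(X)$rank
print(c(n_params = n_params, design_rank = design_rank))

# For patient "C" the intercept column and the slope column are nonzero
# in the same single row, so they are proportional to each other.
print(X[cell_attr$Patient == "C", , drop = FALSE])
```

With three patients the design asks for six parameters, but the single-cell patient contributes only one independent column, so the rank falls short of the number of parameters.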

Please make sure you have the latest (develop) version of sctransform installed. To install from the develop branch run

remotes::install_github("ChristophH/sctransform@develop")

then restart R. To validate which version is installed run

packageVersion(pkg = 'sctransform')

The output should be 0.3.2.9007

To speed up the model fitting step, you might want to try method = 'qpoisson' or method = 'glmGamPoi'

arshiyaakeel commented 3 years ago

Hello Christoph, I ran it again with the updated version of sctransform and got a different error:

> my_vst_out <- sctransform::vst(counts, cell_attr = md, batch_var = 'Patient')
Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 12479 by 72037
Model formula is y ~ (log_umi) : Patient + Patient + 0
Get Negative Binomial regression parameters per gene
Using 2000 genes, 72037 cells
  |                                                  |   0%
warning: solve(): system is singular; attempting approx solution
warning: solve(): system is singular; attempting approx solution
warning: solve(): system is singular; attempting approx solution
warning: solve(): system is singular; attempting approx solution
Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'x' in selecting a method for function 't': missing value where TRUE/FALSE needed
In addition: There were 22 warnings (use warnings() to see them)

Any idea about it?

Thank you.

Best

Arsh

arshiyaakeel commented 3 years ago

Hello Christoph,

I finally managed to run the normalization without any errors after updating R and all the packages. Thank you very much for your help.

Many thanks. Best Arsh.