satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

SCTransform crashes or hangs #5973

Closed: volkansevim closed this issue 2 years ago

volkansevim commented 2 years ago

SCTransform crashes when run on the pbmc3k dataset, and it hangs when run on my own dataset.

I reinstalled sctransform from GitHub (develop branch) before testing this: devtools::install_github("satijalab/sctransform", ref="develop")

library(Seurat)
library(ggplot2)
library(sctransform)
pbmc_data <- Read10X(data.dir = "~/prj/seurat_tut/pbmc/filtered_gene_bc_matrices/hg19/")
pbmc <- CreateSeuratObject(counts = pbmc_data)
pbmc <- PercentageFeatureSet(pbmc, pattern = "^MT-", col.name = "percent.mt")
pbmc <- SCTransform(pbmc, verbose = TRUE)
Calculating cell attributes from input UMI matrix: log_umi

Variance stabilizing transformation of count matrix of size 12572 by 2700

Model formula is y ~ log_umi

Get Negative Binomial regression parameters per gene

Using 2000 genes, 2700 cells

  |                                                                      |   0%
Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 't': missing value where TRUE/FALSE needed
Traceback:

1. SCTransform(pbmc, verbose = TRUE)
2. do.call(what = "vst", args = vst.args)
3. vst(umi = new("dgCMatrix", i = c(70L, 166L, 178L, 326L, 363L, 
 . 410L, 412L, 492L, 494L, 495L, 496L, 525L, 556L, 558L, 671L, 684L, 
 . 735L, 770L, 793L, 820L, 859L, 871L, 908L, 926L, 941L, 966L, 998L, 
 . 1029L, 1057L, 1109L, 1313L, 1332L, 1362L, 1421L, 1546L, 1611L, 

Using method=glmGamPoi produces a different error:

Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 12572 by 2700
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 2700 cells
  |                                                                      |   0%
warning: solve(): system is singular; attempting approx solution
Error in fitBeta_fisher_scoring(Y, model_matrix, exp(offset_matrix), dispersions,  :
  solve(): solution not found

sessionInfo()

> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS/LAPACK: /home/vsevim/software/anaconda3/lib/libmkl_rt.so.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] BiocManager_1.30.18 sctransform_0.3.3   ggplot2_3.3.6
[4] sp_1.4-7            SeuratObject_4.1.0  Seurat_4.1.1

loaded via a namespace (and not attached):
  [1] plyr_1.8.7                  igraph_1.3.1
  [3] lazyeval_0.2.2              splines_4.1.1
  [5] listenv_0.8.0               scattermore_0.8
  [7] usethis_2.1.5               GenomeInfoDb_1.30.1
  [9] digest_0.6.29               htmltools_0.5.2
 [11] fansi_1.0.3                 magrittr_2.0.3
 [13] memoise_2.0.1               tensor_1.5
 [15] cluster_2.1.3               ROCR_1.0-11
 [17] remotes_2.4.2               globals_0.15.0
 [19] matrixStats_0.62.0          spatstat.sparse_2.1-1
 [21] prettyunits_1.1.1           colorspace_2.0-3
 [23] ggrepel_0.9.1               dplyr_1.0.9
 [25] callr_3.7.0                 crayon_1.5.1
 [27] RCurl_1.98-1.6              jsonlite_1.8.0
 [29] progressr_0.10.0            spatstat.data_2.2-0
 [31] survival_3.3-1              zoo_1.8-10
 [33] glue_1.6.2                  polyclip_1.10-0
 [35] gtable_0.3.0                zlibbioc_1.40.0
 [37] XVector_0.34.0              leiden_0.4.2
 [39] DelayedArray_0.20.0         pkgbuild_1.3.1
 [41] future.apply_1.9.0          BiocGenerics_0.40.0
 [43] abind_1.4-5                 scales_1.2.0
 [45] DBI_1.1.2                   spatstat.random_2.2-0
 [47] miniUI_0.1.1.1              Rcpp_1.0.8.3
 [49] viridisLite_0.4.0           xtable_1.8-4
 [51] reticulate_1.25             spatstat.core_2.4-4
 [53] stats4_4.1.1                htmlwidgets_1.5.4
 [55] httr_1.4.3                  RColorBrewer_1.1-3
 [57] ellipsis_0.3.2              ica_1.0-2
 [59] pkgconfig_2.0.3             uwot_0.1.11
 [61] deldir_1.0-6                utf8_1.2.2
 [63] tidyselect_1.1.2            rlang_1.0.2
 [65] reshape2_1.4.4              later_1.3.0
 [67] munsell_0.5.0               tools_4.1.1
 [69] cachem_1.0.6                cli_3.3.0
 [71] generics_0.1.2              devtools_2.4.3
 [73] ggridges_0.5.3              stringr_1.4.0
 [75] fastmap_1.1.0               goftest_1.2-3
 [77] processx_3.5.3              fs_1.5.2
 [79] fitdistrplus_1.1-8          purrr_0.3.4
 [81] RANN_2.6.1                  pbapply_1.5-0
 [83] future_1.25.0               nlme_3.1-157
 [85] sparseMatrixStats_1.6.0     mime_0.12
 [87] brio_1.1.3                  compiler_4.1.1
 [89] curl_4.3.2                  plotly_4.10.0
 [91] png_0.1-7                   testthat_3.1.4
 [93] spatstat.utils_2.3-1        tibble_3.1.7
 [95] glmGamPoi_1.6.0             stringi_1.7.6
 [97] ps_1.7.0                    desc_1.4.1
 [99] rgeos_0.5-9                 lattice_0.20-45
[101] Matrix_1.4-1                vctrs_0.4.1
[103] pillar_1.7.0                lifecycle_1.0.1
[105] spatstat.geom_2.4-0         lmtest_0.9-40
[107] RcppAnnoy_0.0.19            data.table_1.14.2
[109] cowplot_1.1.1               bitops_1.0-7
[111] irlba_2.3.5                 httpuv_1.6.5
[113] patchwork_1.1.1             GenomicRanges_1.46.1
[115] R6_2.5.1                    promises_1.2.0.1
[117] KernSmooth_2.23-20          gridExtra_2.3
[119] IRanges_2.28.0              parallelly_1.31.1
[121] sessioninfo_1.2.2           codetools_0.2-18
[123] MASS_7.3-57                 assertthat_0.2.1
[125] pkgload_1.2.4               SummarizedExperiment_1.24.0
[127] rprojroot_2.0.3             withr_2.5.0
[129] S4Vectors_0.32.4            GenomeInfoDbData_1.2.7
[131] mgcv_1.8-40                 parallel_4.1.1
[133] grid_4.1.1                  rpart_4.1.16
[135] tidyr_1.2.0                 DelayedMatrixStats_1.16.0
[137] MatrixGenerics_1.6.0        Rtsne_0.16
[139] Biobase_2.54.0              shiny_1.7.1

saketkc commented 2 years ago

I am unable to replicate this with the pbmc3k dataset.

> library(ggplot2)
> library(sctransform)
> pbmc_data <- Read10X(data.dir = "./data/pbmc3k/filtered_gene_bc_matrices/hg19/")
> pbmc <- CreateSeuratObject(counts = pbmc_data)
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
> pbmc <- PercentageFeatureSet(pbmc, pattern = "^MT-", col.name = "percent.mt")
> pbmc <- SCTransform(pbmc, verbose = TRUE)
Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 12572 by 2700
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 2700 cells
  |====================================================================================| 100%
Found 147 outliers - those will be ignored in fitting/regularization step

Second step: Get residuals using fitted parameters for 12572 genes
  |====================================================================================| 100%
Computing corrected count matrix for 12572 genes
  |====================================================================================| 100%
Calculating gene attributes
Wall clock passed: Time difference of 27.15488 secs
Determine variable features
Place corrected count matrix in counts slot
Centering data matrix
  |====================================================================================| 100%
Set default assay to SCT
> pbmc <- SCTransform(pbmc, verbose = TRUE, method = "glmGamPoi")
Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 12572 by 2700
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 2700 cells
  |=====================================================================================| 100%
Found 183 outliers - those will be ignored in fitting/regularization step

Second step: Get residuals using fitted parameters for 12572 genes
  |=====================================================================================| 100%
Computing corrected count matrix for 12572 genes
  |=====================================================================================| 100%
Calculating gene attributes
Wall clock passed: Time difference of 16.22976 secs
Determine variable features
Place corrected count matrix in counts slot
Centering data matrix
  |=====================================================================================| 100%
Set default assay to SCT

What is the output of the following?

anyNA(pbmc_data)
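
anyNA() only looks for missing values. A slightly broader input check might look like the sketch below. It assumes the counts are a dgCMatrix, which is what Read10X typically returns, or a plain numeric matrix. check_counts is a hypothetical helper written for this illustration and is not part of Seurat or sctransform; the toy matrix stands in for pbmc_data.

```r
library(Matrix)

# Hypothetical helper (not a Seurat/sctransform function): summarise
# properties of a count matrix that could upset per-gene model fitting.
check_counts <- function(counts) {
  # For a dgCMatrix only the stored (non-zero) entries need inspecting.
  vals <- if (methods::is(counts, "dgCMatrix")) counts@x else as.vector(counts)
  list(
    any_na        = anyNA(vals),
    any_nonfinite = any(!is.finite(vals)),
    any_negative  = any(vals < 0, na.rm = TRUE),
    # Genes (rows) and cells (columns) with no counts at all.
    empty_genes   = sum(Matrix::rowSums(counts, na.rm = TRUE) == 0),
    empty_cells   = sum(Matrix::colSums(counts, na.rm = TRUE) == 0)
  )
}

# Toy 3 x 3 sparse matrix standing in for pbmc_data (assumption):
# row 3 and column 2 are deliberately left empty.
toy <- Matrix::sparseMatrix(
  i = c(1, 2, 2), j = c(1, 1, 3), x = c(5, 1, 2), dims = c(3, 3)
)
res <- check_counts(toy)
str(res)
```

If every field comes back FALSE or 0 for the real matrix, the input itself is unlikely to be the source of the error, and the installed packages and libraries become the more interesting thing to compare.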

volkansevim commented 2 years ago

anyNA(pbmc_data) is FALSE.

I just created a new conda environment and installed Seurat in it. SCTransform works fine in the new environment. Its session info is below.

I'm closing the issue as my problem has been solved.

> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS/LAPACK: /home/aaa/software/anaconda3/envs/Renv/lib/libopenblasp-r0.3.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] sctransform_0.3.3   ggplot2_3.3.6       sp_1.4-7
[4] SeuratObject_4.1.0  Seurat_4.1.1        BiocManager_1.30.18

loaded via a namespace (and not attached):
  [1] Rtsne_0.16            colorspace_2.0-3      deldir_1.0-6
  [4] ellipsis_0.3.2        ggridges_0.5.3        IRdisplay_1.1
  [7] base64enc_0.1-3       spatstat.data_2.2-0   leiden_0.4.2
 [10] listenv_0.8.0         ggrepel_0.9.1         fansi_1.0.3
 [13] codetools_0.2-18      splines_4.1.3         polyclip_1.10-0
 [16] IRkernel_1.3          jsonlite_1.8.0        ica_1.0-2
 [19] cluster_2.1.3         png_0.1-7             rgeos_0.5-9
 [22] uwot_0.1.11           shiny_1.7.1           spatstat.sparse_2.1-1
 [25] compiler_4.1.3        httr_1.4.3            assertthat_0.2.1
 [28] Matrix_1.4-1          fastmap_1.1.0         lazyeval_0.2.2
 [31] cli_3.3.0             later_1.3.0           htmltools_0.5.2
 [34] tools_4.1.3           igraph_1.3.1          gtable_0.3.0
 [37] glue_1.6.2            RANN_2.6.1            reshape2_1.4.4
 [40] dplyr_1.0.9           Rcpp_1.0.8.3          scattermore_0.8
 [43] vctrs_0.4.1           nlme_3.1-157          progressr_0.10.0
 [46] lmtest_0.9-40         spatstat.random_2.2-0 stringr_1.4.0
 [49] globals_0.15.0        mime_0.12             miniUI_0.1.1.1
 [52] lifecycle_1.0.1       irlba_2.3.5           goftest_1.2-3
 [55] future_1.25.0         MASS_7.3-57           zoo_1.8-10
 [58] scales_1.2.0          spatstat.core_2.4-4   promises_1.2.0.1
 [61] spatstat.utils_2.3-1  parallel_4.1.3        RColorBrewer_1.1-3
 [64] reticulate_1.25       pbapply_1.5-0         gridExtra_2.3
 [67] rpart_4.1.16          stringi_1.7.6         repr_1.1.4
 [70] rlang_1.0.2           pkgconfig_2.0.3       matrixStats_0.62.0
 [73] evaluate_0.15         lattice_0.20-45       ROCR_1.0-11
 [76] purrr_0.3.4           tensor_1.5            patchwork_1.1.1
 [79] htmlwidgets_1.5.4     cowplot_1.1.1         tidyselect_1.1.2
 [82] parallelly_1.31.1     RcppAnnoy_0.0.19      plyr_1.8.7
 [85] magrittr_2.0.3        R6_2.5.1              generics_0.1.2
 [88] pbdZMQ_0.3-7          DBI_1.1.2             pillar_1.7.0
 [91] withr_2.5.0           mgcv_1.8-40           fitdistrplus_1.1-8
 [94] survival_3.3-1        abind_1.4-5           tibble_3.1.7
 [97] future.apply_1.9.0    crayon_1.5.1          uuid_1.1-0
[100] KernSmooth_2.23-20    utf8_1.2.2            spatstat.geom_2.4-0
[103] plotly_4.10.0         grid_4.1.3            data.table_1.14.2
[106] digest_0.6.29         xtable_1.8-4          tidyr_1.2.0
[109] httpuv_1.6.5          munsell_0.5.0         viridisLite_0.4.0
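
The two sessionInfo() dumps in this thread differ in several ways: the R version (4.1.1 vs 4.1.3), the BLAS/LAPACK library (libmkl_rt.so.1 vs libopenblasp-r0.3.20.so), and the set of loaded namespaces. The thread does not establish which of these differences mattered. When comparing a failing and a working environment, a short sketch like the one below records the linear algebra backend and runs a trivial solve() as a smoke test. It uses only base R; the 2 x 2 system is an arbitrary example chosen for this illustration.

```r
# Record which R build and linear algebra libraries this session uses.
si <- sessionInfo()
info <- list(
  r_version = R.version.string,
  blas      = si$BLAS,    # path to the BLAS library, as printed by sessionInfo()
  lapack    = si$LAPACK   # path to the LAPACK library
)
print(info)

# Smoke test: solve a small, well-conditioned system A x = b and
# confirm that the solution reproduces b.
A <- matrix(c(2, 1, 1, 3), nrow = 2)
b <- c(1, 2)
x <- solve(A, b)
ok <- isTRUE(all.equal(as.vector(A %*% x), b))
print(ok)
```

Running this in both environments and comparing the output side by side shows whether the backends differ. It cannot by itself prove that the backend caused the SCTransform failure.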