Closed IrinaVKuznetsova closed 7 months ago
Hi, I've never seen this error, but it could be a memory and/or multithreading issue. I'd recommend checking the following:
1) Monitor memory usage while the job runs (e.g. with htop).
2) as.SingleCellExperiment had a bug that made the object huge (although this should be solved in the version you're using). So check the size (e.g. using format(object.size(x), units="Gb")) of both cell_bender_seurat and sce. If you see that sce is much bigger, you can always skip the conversion and run scDblFinder with something like:
sce <- scDblFinder(GetAssayData(cell_bender_seurat, slot="counts", assay="RNA3"),
                   samples=cell_bender_seurat$orig.ident)
3) If from htop it does seem to be memory-related, try reducing the number of threads (or eventually using a single one).
Thank you for the prompt response.
1) It looks normal (below 1%).
2) Seems OK:
format(object.size(cell_bender_seurat), units="Gb") # "8 Gb"
format(object.size(sce), units="Gb") # "2.9 Gb"
A) Could it have something to do with how Seurat v5 uses layers (8 samples, so 8 layers for counts, for example), so that when I convert it to an Assay v3 it becomes one matrix of 36601 x 75331?
B) Tried to run without threads:
sce.standard <- scDblFinder(sce, samples = "orig.ident")
Warning messages:
1: In rpois(nrow(x) * length(wAd), as.numeric(as.matrix(x[, wAd]))) :
NAs produced
2: In value[[3L]](cond) :
Error in calculating norm factors:Error in .local(x, ...): size factors should be positive
C) Tried this too
sce <- scDblFinder(GetAssayData(cell_bender_seurat, slot="counts", assay="RNA3"),
samples=cell_bender_seurat$orig.ident)
Error in .checkSCE(sce) :
`sce` should be a SingleCellExperiment, a SummarizedExperiment, or an array (i.e. matrix, sparse matric, etc.) of counts.
In addition: Warning message:
The `slot` argument of `GetAssayData()` is deprecated as of SeuratObject 5.0.0.
ℹ Please use the `layer` argument instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
Not sure I understand your question A; the original Seurat object also has dimensions 36601 x 75331...
1) What is the output of class(GetAssayData(cell_bender_seurat, layer="counts", assay="RNA3"))?
2) What is the output of quantile(colSums(counts(sce)))?
3) Does the following work?
sce.standard <- scDblFinder(sce[VariableFeatures(cell_bender_seurat),], samples = "orig.ident")
1)
class(GetAssayData(cell_bender_seurat, layer="counts", assay="RNA3"))
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"
2)
quantile(colSums(counts(sce)))
0% 25% 50% 75% 100%
201 650 2209 5732 81977
sce.standard <- scDblFinder(sce[VariableFeatures(cell_bender_seurat),], samples = "orig.ident")
I'm unsure what the issue is here, but it appears to be related to 1) the fact that you have cells with a very low library size (your 201 is crap; personally I'd have filtered out many) and 2) the feature selection internal to scDblFinder, which might have resulted in some cells not having reads in those features. This appears to have been solved by using the VariableFeatures (which is a perfectly decent way of doing things), and would most likely also be solved by filtering out cells with a low library size (e.g. keeping those with >=400-500).
If you want, you can try again with multithreading, using either of these 2 solutions.
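As a rough illustration of the library-size filter suggested above, here is a minimal sketch on a made-up sparse matrix. The matrix values, the names counts and filtered, and the min_lib_size cutoff of 500 are assumptions for the example only, not values taken from the data in this thread.

```r
library(Matrix)

# Toy genes x cells count matrix; the numbers are invented for illustration.
counts <- Matrix(c(120, 300, 400,
                     0, 250, 110,
                    80, 200,   0),
                 nrow = 3, byrow = TRUE, sparse = TRUE,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3)))

# Hypothetical cutoff, picked from the 400-500 range mentioned above.
min_lib_size <- 500

# Library size = total counts per cell (column sums).
lib_size <- Matrix::colSums(counts)

# Keep only the cells that reach the cutoff.
keep <- lib_size >= min_lib_size
filtered <- counts[, keep, drop = FALSE]

print(lib_size)
print(colnames(filtered))
```

On a SingleCellExperiment the same idea would probably look like sce[, colSums(counts(sce)) >= min_lib_size], reusing the colSums(counts(sce)) expression already used in this thread.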
How long on average does it take to run scDblFinder?
1) It's been ~5 hrs.
2) I filtered the data and ran it again, and it eventually crashed:
quantile(colSums(counts(sce)))
0% 25% 50% 75% 100%
451 1189 3332 6480 81977
sce.standard <- scDblFinder(sce, samples = "orig.ident", BPPARAM=MulticoreParam(8))
Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
convergence criterion below machine epsilon
Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
did not converge--results might be invalid!; try increasing work or maxit
Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
convergence criterion below machine epsilon
Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
did not converge--results might be invalid!; try increasing work or maxit
Stop worker failed with the error: wrong args for environment subassignment
I figured out why I was getting that error. A few steps back in my analysis, I removed ambient RNA with CellBender v3, which generated negative values in the count matrix; that's why scDblFinder() was not able to process my data. The issue of CellBender generating negative counts is discussed here: https://github.com/broadinstitute/CellBender/issues/306. To fix it, run CellBender v2 and re-run scDblFinder().
All works now, and quite quickly. Cheers.
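For anyone hitting the same symptoms, a quick sanity check on the count matrix before calling scDblFinder() might look like the sketch below. The toy matrix and the check_counts helper are hypothetical and written only for this example; this is not the check that scDblFinder itself performs. The last lines show that rpois() returns NA (with a "NAs produced" warning) for a negative rate, which is consistent with the first warning reported earlier in this thread.

```r
library(Matrix)

# Toy genes x cells matrix with one negative entry, invented for illustration.
counts <- Matrix(c(5,  0, 2,
                   0, -1, 3,
                   1,  4, 0),
                 nrow = 3, byrow = TRUE, sparse = TRUE)

# Hypothetical helper: stop early if the matrix cannot be raw counts.
check_counts <- function(m) {
  vals <- m@x  # stored (non-zero) entries of the sparse matrix
  if (anyNA(vals)) stop("count matrix contains NA values")
  if (any(vals < 0)) stop("count matrix contains negative values")
  invisible(TRUE)
}

# Capture the error message instead of aborting the script.
res <- tryCatch(check_counts(counts), error = function(e) conditionMessage(e))
print(res)

# A negative Poisson rate yields NA; the warning is suppressed here.
draws <- suppressWarnings(rpois(3, c(5, -1, 2)))
print(is.na(draws))
```

On a real object the matrix passed to the helper would be something like counts(sce), assuming it is stored as a sparse dgCMatrix as shown earlier in the thread.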
Hi, Great that we have an explanation, thanks for coming back on this. I've now added in the devel version a check of that so that a more useful error message is provided. Best, Pierre-Luc
Dear scDblFinder developer,
This is the first time I am trying to use your tool. Unfortunately, I am getting an error and am not sure how to fix it. Running on Linux (Ubuntu) with 250 RAM, 64 CPUs, 3T free space.
I'd appreciate any suggestions. Thank you
sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS/LAPACK: /data/bin/conda_env_location/PDX_manuscript_2023_v2/lib/libopenblasp-r0.3.26.so; LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
 [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
 [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base

other attached packages:
 [1] BiocParallel_1.36.0 scDblFinder_1.16.0
 [3] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0
 [5] Biobase_2.62.0 GenomicRanges_1.54.1
 [7] GenomeInfoDb_1.38.1 IRanges_2.36.0
 [9] S4Vectors_0.40.2 BiocGenerics_0.48.1
[11] MatrixGenerics_1.14.0 matrixStats_1.2.0
[13] Seurat_5.0.1 SeuratObject_5.0.0
[15] sp_2.1-3

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.22 splines_4.3.2
  [3] later_1.3.2 BiocIO_1.12.0
  [5] bitops_1.0-7 tibble_3.2.1
  [7] polyclip_1.10-6 XML_3.99-0.16.1
  [9] fastDummies_1.7.3 lifecycle_1.0.4
 [11] edgeR_4.0.2 globals_0.16.2
 [13] lattice_0.22-5 MASS_7.3-60
 [15] magrittr_2.0.3 limma_3.58.1
 [17] plotly_4.10.4 yaml_2.3.8
 [19] metapod_1.10.0 httpuv_1.6.14
 [21] sctransform_0.4.1 spam_2.10-0
 [23] spatstat.sparse_3.0-3 reticulate_1.35.0
 [25] cowplot_1.1.3 pbapply_1.7-2
 [27] RColorBrewer_1.1-3 abind_1.4-5
 [29] zlibbioc_1.48.0 Rtsne_0.17
 [31] purrr_1.0.2 RCurl_1.98-1.14
 [33] GenomeInfoDbData_1.2.11 ggrepel_0.9.5
 [35] irlba_2.3.5.1 listenv_0.9.1
 [37] spatstat.utils_3.0-4 goftest_1.2-3
 [39] RSpectra_0.16-1 dqrng_0.3.2
 [41] spatstat.random_3.2-2 fitdistrplus_1.1-11
 [43] parallelly_1.36.0 DelayedMatrixStats_1.24.0
 [45] leiden_0.4.3.1 codetools_0.2-19
 [47] DelayedArray_0.28.0 scuttle_1.12.0
 [49] tidyselect_1.2.0 ScaledMatrix_1.10.0
 [51] viridis_0.6.5 spatstat.explore_3.2-6
 [53] GenomicAlignments_1.38.0 jsonlite_1.8.8
 [55] BiocNeighbors_1.20.0 ellipsis_0.3.2
 [57] progressr_0.14.0 ggridges_0.5.6
 [59] survival_3.5-7 scater_1.30.1
 [61] tools_4.3.2 ica_1.0-3
 [63] Rcpp_1.0.12 glue_1.7.0
 [65] gridExtra_2.3 SparseArray_1.2.2
 [67] dplyr_1.1.4 fastmap_1.1.1
 [69] bluster_1.12.0 fansi_1.0.6
 [71] digest_0.6.34 rsvd_1.0.5
 [73] R6_2.5.1 mime_0.12
 [75] colorspace_2.1-0 scattermore_1.2
 [77] tensor_1.5 spatstat.data_3.0-4
 [79] utf8_1.2.4 tidyr_1.3.1
 [81] generics_0.1.3 data.table_1.14.10
 [83] rtracklayer_1.62.0 httr_1.4.7
 [85] htmlwidgets_1.6.4 S4Arrays_1.2.0
 [87] uwot_0.1.16 pkgconfig_2.0.3
 [89] gtable_0.3.4 lmtest_0.9-40
 [91] XVector_0.42.0 htmltools_0.5.7
 [93] dotCall64_1.1-1 scales_1.3.0
 [95] png_0.1-8 scran_1.30.0
 [97] reshape2_1.4.4 rjson_0.2.21
 [99] nlme_3.1-164 zoo_1.8-12
[101] stringr_1.5.1 KernSmooth_2.23-22
[103] parallel_4.3.2 miniUI_0.1.1.1
[105] vipor_0.4.7 restfulr_0.0.15
[107] pillar_1.9.0 grid_4.3.2
[109] vctrs_0.6.5 RANN_2.6.1
[111] promises_1.2.1 BiocSingular_1.18.0
[113] beachmat_2.18.0 xtable_1.8-4
[115] cluster_2.1.6 beeswarm_0.4.0
[117] locfit_1.5-9.8 cli_3.6.2
[119] compiler_4.3.2 Rsamtools_2.18.0
[121] rlang_1.1.3 crayon_1.5.2
[123] future.apply_1.11.1 plyr_1.8.9
[125] ggbeeswarm_0.7.2 stringi_1.8.3
[127] viridisLite_0.4.2 deldir_2.0-2
[129] munsell_0.5.0 Biostrings_2.70.1
[131] lazyeval_0.2.2 spatstat.geom_3.2-8
[133] Matrix_1.6-1.1 RcppHNSW_0.6.0
[135] patchwork_1.2.0 sparseMatrixStats_1.14.0
[137] future_1.33.1 ggplot2_3.4.4
[139] statmod_1.5.0 shiny_1.8.0
[141] ROCR_1.0-11 igraph_1.6.0
[143] xgboost_2.0.3.1