satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

SketchData() function error " function 'as_cholmod_sparse' not provided by package 'Matrix' " #9079

Closed CDdumortier closed 1 week ago

CDdumortier commented 2 weeks ago

Hello,

I'm currently using Seurat v5 and trying to follow the tutorial "Sketch integration using a 1 million cell dataset from Parse Biosciences" (https://satijalab.org/seurat/articles/parsebio_sketch_integration), applying it to my Seurat object, which contains more than 1.3 million cells with 24 different counts layers from different studies. It systematically gives me an error.

Here is my code:

MetaAtlas        <- readRDS("XXX/MetaAtlas.UncoVer.init.obj.rds")

MetaAtlas
An object of class Seurat
58386 features across 1324192 samples within 1 assay
Active assay: RNA (58386 features, 0 variable features)
 24 layers present: counts.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1, counts.2, counts.2.1, counts.2.1.1, counts.2.1.1.1, counts.2.1.1.1.1, counts.2.1.1.1.1.1, counts.2.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1, counts.2.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1

MetaAtlas  <- NormalizeData(MetaAtlas, verbose = F)
MetaAtlas  <- FindVariableFeatures(MetaAtlas, verbose = F)

# split object
# MetaAtlas[["RNA"]] <- split(MetaAtlas[["RNA"]], f = MetaAtlas$article.ID) # not needed: object is already split into different layers

MetaAtlas <- SketchData(object = MetaAtlas, ncells = 5000, method = "LeverageScore", sketched.assay = "sketch")
Calcuating Leverage Score
Error in irlba(A = object, nv = 50, nu = 0, verbose = FALSE) : 
  function 'as_cholmod_sparse' not provided by package 'Matrix'

My session information:

sessionInfo():

R version 4.3.3 (2024-02-29)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: XXX;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: Europe/Paris
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Seurat_5.0.3       SeuratObject_5.0.1 sp_2.1-3

loaded via a namespace (and not attached):
  [1] deldir_2.0-4           pbapply_1.7-2          gridExtra_2.3
  [4] rlang_1.1.3            magrittr_2.0.3         RcppAnnoy_0.0.22
  [7] matrixStats_1.3.0      ggridges_0.5.6         compiler_4.3.3
 [10] spatstat.geom_3.2-9    png_0.1-8              vctrs_0.6.5
 [13] reshape2_1.4.4         stringr_1.5.1          pkgconfig_2.0.3
 [16] fastmap_1.1.1          utf8_1.2.4             promises_1.3.0
 [19] purrr_1.0.2            jsonlite_1.8.8         goftest_1.2-3
 [22] later_1.3.2            spatstat.utils_3.0-4   irlba_2.3.5.1
 [25] parallel_4.3.3         cluster_2.1.6          R6_2.5.1
 [28] ica_1.0-3              stringi_1.8.3          RColorBrewer_1.1-3
 [31] spatstat.data_3.0-4    reticulate_1.36.0      parallelly_1.37.1
 [34] lmtest_0.9-40          scattermore_1.2        Rcpp_1.0.11.6
 [37] tensor_1.5             future.apply_1.11.2    zoo_1.8-12
 [40] sctransform_0.4.1      httpuv_1.6.15          Matrix_1.6-5
 [43] splines_4.3.3          igraph_2.0.2           tidyselect_1.2.0
 [46] abind_1.4-5            spatstat.random_3.2-3  codetools_0.2-20
 [49] miniUI_0.1.1.1         spatstat.explore_3.2-6 listenv_0.9.1
 [52] lattice_0.22-6         tibble_3.2.1           plyr_1.8.9
 [55] shiny_1.8.1.1          ROCR_1.0-11            Rtsne_0.17
 [58] future_1.33.2          fastDummies_1.7.3      survival_3.5-8
 [61] polyclip_1.10-6        fitdistrplus_1.1-11    pillar_1.9.0
 [64] KernSmooth_2.23-22     plotly_4.10.4          generics_0.1.3
 [67] RcppHNSW_0.6.0         ggplot2_3.5.0          munsell_0.5.1
 [70] scales_1.3.0           globals_0.16.3         xtable_1.8-4
 [73] glue_1.7.0             lazyeval_0.2.2         tools_4.3.3
 [76] data.table_1.15.2      RSpectra_0.16-1        RANN_2.6.1
 [79] leiden_0.4.3.1         dotCall64_1.1-1        cowplot_1.1.3
 [82] grid_4.3.3             tidyr_1.3.1            colorspace_2.1-0
 [85] nlme_3.1-164           patchwork_1.2.0        cli_3.6.2
 [88] spatstat.sparse_3.0-3  spam_2.10-0            fansi_1.0.6
 [91] viridisLite_0.4.2      dplyr_1.1.4            uwot_0.1.16
 [94] gtable_0.3.4           digest_0.6.35          progressr_0.14.0
 [97] ggrepel_0.9.5          htmlwidgets_1.6.4      htmltools_0.5.8.1
[100] lifecycle_1.0.4        httr_1.4.7             mime_0.12
[103] MASS_7.3-60

I've already tried updating and uninstalling/reinstalling the Matrix, irlba, and RSpectra packages, without success.

Moreover, this is part of an internship in the first year of my Master's degree, so I'm working in a conda environment that I share with colleagues. After I tried this approach and broke two different environments, I am now "forbidden" from installing or updating packages as I wish.

Does anyone have a working alternative?

nigiord commented 2 weeks ago

Hi @CDdumortier, unfortunately this is a bug present in the Matrix package, and in the packages that depend on it, as currently compiled in the bioconda repository. It’s due to a change made to Matrix in October 2023 (https://stat.ethz.ch/pipermail/r-sig-mac/2023-October/014890.html) combined with the way R package compilation works. It breaks multiple packages that rely on Matrix (the standard for dealing with sparse matrices in R).

Currently, the only solution is to recompile Matrix and the packages that depend on it (irlba, TFBSTools, etc.) using the command install.packages("packagename", type = "source", force = TRUE). Unfortunately, as you noticed, this often has unintended consequences inside conda environments and can break them. On my side, I managed to fix it by using the following command whenever I install something:

install.packages(
  "https://cran.r-project.org/src/contrib/Archive/Matrix/Matrix_1.6-5.tar.gz",  # path to exact version of Matrix I already have
  repos = NULL,
  type = "source",
  lib = "path/to/condaenv/library",  # directory with conda environment, visible with .libPaths()
  INSTALL_opts = c('--no-lock')
)

If you’re not allowed to install anything, the only solution is probably to clone the shared conda environment to your user space, and experiment on it with the command above, leaving the shared environment untouched.
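One way to check whether a reinstall actually helped, without loading a 1.3 million cell object, is to call irlba directly on a small sparse matrix. This is only a sketch: it assumes that a dgCMatrix input goes through the same compiled Matrix routines that fail inside SketchData(), and the matrix size and nv value are arbitrary choices for illustration.

```r
# Minimal check, assuming Matrix and irlba are installed in the active library.
library(Matrix)
library(irlba)

set.seed(42)
# Small random sparse matrix (class dgCMatrix); dimensions are arbitrary.
m <- rsparsematrix(nrow = 200, ncol = 100, density = 0.05)

# Capture the error instead of stopping, so the outcome can be reported.
result <- tryCatch(
  irlba(m, nv = 5),
  error = function(e) e
)

if (inherits(result, "error")) {
  message("irlba still fails: ", conditionMessage(result))
} else {
  message("irlba ran on a sparse matrix and returned ",
          length(result$d), " singular values")
}
```

If the message still mentions 'as_cholmod_sparse', irlba was most likely not rebuilt against the Matrix version that is currently installed.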

I’m not sure whether all the conda packages that rely on Matrix will be fixed at some point, but currently it’s really a pain with workflow managers that rely on conda for reproducibility (snakemake, for instance).

mhkowalski commented 1 week ago

Thanks for your comprehensive answer @nigiord. You will have to re-install Matrix, and the packages that depend on it, from source to fix this error.
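To get a rough idea of which installed packages may need to be rebuilt after Matrix, one option is to list those that declare Matrix in their LinkingTo field, since these compile against Matrix's C headers. The helper name links_to_matrix below is hypothetical (it is not part of Seurat or Matrix), and the result is a starting point rather than a definitive list.

```r
# Hypothetical helper: TRUE when a LinkingTo field names the Matrix package.
links_to_matrix <- function(linking_to) {
  if (is.na(linking_to)) return(FALSE)
  # Split "Rcpp, Matrix (>= 1.3-0)" into individual entries.
  deps <- trimws(strsplit(linking_to, ",", fixed = TRUE)[[1]])
  # Drop version constraints such as "(>= 1.3-0)".
  deps <- sub("[[:space:]]*\\(.*$", "", deps)
  "Matrix" %in% deps
}

# installed.packages() reports the LinkingTo field for every installed package.
ip <- installed.packages()
needs_rebuild <- rownames(ip)[vapply(ip[, "LinkingTo"], links_to_matrix, logical(1))]
print(needs_rebuild)
```

Each package listed could then be reinstalled from source, after Matrix itself, using the same install.packages() options shown earlier in this thread.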