Open CianMurphy opened 6 months ago
Hey @CianMurphy,
the reason you see this error is a limitation on the size of sparse matrices imposed by dgCMatrix, which is the default sparse matrix class used by Seurat. See (for example) https://github.com/satijalab/seurat/issues/4380 for more details. The dataset you're trying to query is large enough to hit that limit and would fail a Seurat conversion even outside the Census.
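For a rough sense of where that limit sits: dgCMatrix indexes its stored values with 32-bit integers, so it cannot hold more than 2^31 - 1 non-zero entries, no matter how much memory the machine has. The sketch below uses base R only. The helper name `fits_in_dgCMatrix` and the cell, gene and density figures are made up for illustration and are not measurements of the dataset in this issue.

```r
# Largest value an R integer can hold (32-bit signed): 2^31 - 1.
max_nnz <- .Machine$integer.max

# Hypothetical helper: estimate the number of stored non-zero values in a
# cells x genes matrix and report whether that count fits in a dgCMatrix.
fits_in_dgCMatrix <- function(n_cells, n_genes, density) {
  # Work in doubles; integer multiplication would overflow to NA here.
  est_nnz <- as.numeric(n_cells) * as.numeric(n_genes) * density
  est_nnz <= max_nnz
}

# Illustrative numbers only.
fits_in_dgCMatrix(n_cells = 1e5, n_genes = 6e4, density = 0.05)  # ~3e8 entries
fits_in_dgCMatrix(n_cells = 2e6, n_genes = 6e4, density = 0.05)  # ~6e9 entries
```

The first call stays under the limit and the second does not, which is why adding RAM does not make the error go away.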
This limitation can be removed by using Seurat v5, since it allows using a sparse matrix class other than dgCMatrix. Currently Seurat v5 isn't supported by the Census or by TileDB-SOMA, the backend library used by the Census, although it's on the roadmap. We will publish an update when it is available. In the meantime, those large datasets/slices can be analyzed with Python, which doesn't have this limitation.
Describe the bug
When trying to create a Seurat object, I get the error message:
Error in vec_to_Array(x, type) :
  long vectors not supported yet: memory.c:3888
Calls: get_seurat ... -> -> -> vec_to_Array
Execution halted
This is despite running the script on a cluster with 650GB memory.
To Reproduce
library("cellxgene.census")
library("Seurat")
library(data.table)
census_dat <- 'census_datasets.csv'
census <- open_soma()
seurat <- get_seurat(
  census,
  organism = "Homo sapiens",
  obs_value_filter = "dataset_id == '9f222629-9e39-47d0-b83f-e08d610c7479'"
)
Environment
R version 4.3.2 (2023-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
R is installed via mamba and Anaconda3/2023.03