satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Converted H5AD file into H5seurat file; how to load into seurat when the sparse matrix is so large? #7283

Closed: jcdaneshmand closed this issue 1 year ago

jcdaneshmand commented 1 year ago

Hello, I am having an issue very similar to this one, which still seems to be unresolved: https://github.com/satijalab/seurat/issues/6870

Basically, I have a very large h5ad file, converted into an h5Seurat file, and I can't load it into a Seurat object because of the size of the sparse matrix.

I've tried sceasy, as well as scanpy via reticulate, to read the h5ad file and recreate the sparse matrix, but I run into these errors:

    sceasy::convertFormat(h5ad, from="anndata", to="seurat", outFile='GBMap.rds')
    Error in h(simpleError(msg, call)) :
      error in evaluating the argument 'x' in selecting a method for function 't': negative length vectors are not allowed

    sc <- import('scanpy', convert = FALSE)
    numpy <- import('numpy', convert = FALSE)
    anndata <- import('anndata', convert = FALSE)
    scipy <- import('scipy', convert = FALSE)
    adata <- py_to_r(sc$read_h5ad(h5ad))
    mat <- as.sparse(adata$X)
    Error in py_ref_to_r(x) : negative length vectors are not allowed

This is how it's supposed to work, loading with SeuratDisk:

    Convert(h5ad, dest = H5seurat_filename, overwrite = TRUE, verbose = TRUE)

The conversion succeeds,

    GBmap_Seurat <- SeuratDisk::LoadH5Seurat(file = h5seurat)

but then the h5Seurat file cannot be loaded:

    Validating h5Seurat file
    Initializing RNA with data
    Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], :
      'p' must be a nondecreasing vector c(0, ...)
    In addition: Warning message:
    In sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], :
      NAs introduced by coercion to integer range

Even if I connect to the h5Seurat file on disk and pull the data for the sparse matrix myself, I get the same error:

    Gbmap_seurat_connection <- Connect(filename = h5seurat)
    x <- Gbmap_seurat_connection[["assays/RNA/data"]]
    sp <- sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][])

    Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][]) :
      'p' must be a nondecreasing vector c(0, ...)
    In addition: Warning message:
    In sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][]) :
      NAs introduced by coercion to integer range
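
The warning "NAs introduced by coercion to integer range" suggests that some `indptr` values exceed what a 32-bit R integer can hold (2^31 - 1 = 2147483647). That is what would happen if the matrix stores more non-zero values than that. The following is a minimal sketch of that check, written in Python with only the standard library. It is not part of Seurat, SeuratDisk, or scanpy. The function names `fits_r_integer` and `plan_chunks` and the toy `indptr` are hypothetical, and the small `limit` in the usage note is there only to make the behaviour visible.

```python
# Illustrative sketch only: function names and the toy numbers are hypothetical.
R_INT_MAX = 2**31 - 1  # largest value a 32-bit signed R integer can hold


def fits_r_integer(indptr, limit=R_INT_MAX):
    """Return True if every index-pointer value fits within `limit`.

    In a compressed sparse layout the last indptr entry equals the number
    of stored non-zero values, so it is the largest entry.
    """
    return indptr[-1] <= limit


def plan_chunks(indptr, limit=R_INT_MAX):
    """Split the major axis into contiguous [start, stop) ranges whose
    non-zero counts each stay within `limit`.

    Raises ValueError if a single row/column alone exceeds the limit.
    """
    chunks = []
    start = 0
    n = len(indptr) - 1  # number of rows (CSR) or columns (CSC)
    for stop in range(1, n + 1):
        if indptr[stop] - indptr[stop - 1] > limit:
            raise ValueError(f"slice {stop - 1} alone exceeds the limit")
        # Adding slice stop-1 would overflow the current chunk: close it first.
        if indptr[stop] - indptr[start] > limit:
            chunks.append((start, stop - 1))
            start = stop - 1
    chunks.append((start, n))
    return chunks
```

With the toy input `[0, 3, 5, 9, 10]` and an artificially small `limit=5`, `fits_r_integer` reports that the matrix does not fit, and `plan_chunks` splits it into the ranges `(0, 2)` and `(2, 4)`, each holding 5 non-zero values. This only diagnoses the limit and plans a split; it does not by itself load anything into Seurat.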

For reference, the dataset I am trying to load is the extended GBmap, found here: https://cellxgene.cziscience.com/collections/999f2a15-3d7e-440b-96ae-2c806799c08c

Am I going about loading this very large dataset into a Seurat object the right way?

yuhanH commented 1 year ago

Hi @jcdaneshmand, for this large h5ad file you can use BPCells to load the matrix and create a Seurat v5 object. Here is the BPCells interaction vignette you can follow: https://satijalab.org/seurat/articles/seurat5_bpcells_interaction_vignette.html

winner0809 commented 3 months ago

@yuhanH I used BPCells, but I got an error after running SCT and trying to merge. For details, please see my issue here: https://github.com/satijalab/seurat/issues/9111