RPCA or CCA integration with IntegrateLayers fails when using on-disk BPCells matrices #7434

bumpingbell commented 1 year ago

I've been following the "Integrative analysis in Seurat v5" vignette for dataset integration, but because my dataset is large I store the matrix on disk with BPCells. When I run RPCA integration through IntegrateLayers:

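library(Seurat)
library(SeuratData)  # provides LoadData()
library(BPCells)     # provides write_matrix_dir() / open_matrix_dir()

options(Seurat.object.assay.version = 'v5')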

obj <- LoadData("pbmcsca")
obj <- subset(obj, nFeature_RNA > 1000)

write_matrix_dir(mat = obj[["RNA"]]$counts, dir = "~/dir")
counts.mat <- open_matrix_dir(dir = "~/dir")
obj[["RNA"]]$counts <- counts.mat

obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)

obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)

obj <- IntegrateLayers(
  object = obj, method = RPCAIntegration,
  orig.reduction = "pca", new.reduction = "integrated.rpca",
  verbose = TRUE
)

I got this error: Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 'as.matrix': requires numeric/complex matrix/vector arguments

It seems like this error occurs in the FindIntegrationAnchors step, where a function tries to run a crossprod but one of the inputs is not a numeric/complex matrix/vector. I traced the function, and I believe the faulty line of code lies in Seurat:::ProjectSVD, projected.u <- as.matrix(x = crossprod(x = vt, y = data)). The data object is an S4 object of type "RenameDims".
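
The failure mode described above can be reproduced without Seurat or BPCells. The sketch below is only an illustration: `LazyMatrix` is a hypothetical stand-in class (not the real BPCells "RenameDims" class), and the matrices are toy data. It shows that base `crossprod()` rejects an S4 wrapper object and succeeds once the data is materialised as an ordinary matrix.

```r
library(methods)

# Hypothetical stand-in for an on-disk matrix: an S4 class that merely wraps
# a base matrix. This is NOT the real BPCells "RenameDims" class.
setClass("LazyMatrix", representation(values = "matrix"))

vt   <- matrix(as.numeric(1:6), nrow = 3)                               # 3 x 2
lazy <- new("LazyMatrix", values = matrix(as.numeric(1:12), nrow = 3))  # 3 x 4

# base::crossprod() has no method for the S4 wrapper, so this call errors;
# tryCatch() captures the condition instead of stopping the script.
failure <- tryCatch(crossprod(x = vt, y = lazy), error = function(e) e)
print(inherits(failure, "error"))

# Materialising the wrapper into an ordinary matrix first makes the call work.
projected <- crossprod(x = vt, y = lazy@values)
print(dim(projected))
```

With a real BPCells matrix, the equivalent of `lazy@values` would presumably be an explicit conversion such as `as(mat, "dgCMatrix")`, which BPCells documents for bringing a matrix into memory. That gives up the memory savings of on-disk storage, so it is a possible workaround for small inputs rather than a fix.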

Here is the traceback:

14. h(simpleError(msg, call))

13. .handleSimpleError(function (cond) 
.Internal(C_tryCatchHelper(addr, 1L, cond)), "requires numeric/complex matrix/vector arguments", 
    base::quote(crossprod(x, y)))

12. base::crossprod(x, y)

11. crossprod(x = vt, y = data)

10. crossprod(x = vt, y = data)

9. as.matrix(x = crossprod(x = vt, y = data))

8. ProjectSVD(reduction = object.2[[reduction]], data = data.1, 
    mode = reduction, features = common.features, do.scale = do.scale, =, use.original.stats = FALSE, verbose = verbose)

7. ReciprocalProject(object.1 = object.1, object.2 = object.2, reduction = "pca", = "projectedpca", features = anchor.features, 
    do.scale = FALSE, = FALSE, slot = "", 
    l2.norm = l2.norm, verbose = verbose)

6. FUN(X[[i]], ...)

5. lapply(X[Split[[i]]], FUN, ...)

4. pblapply(X = 1:nrow(x = combinations), FUN = anchoring.fxn)

3. FindIntegrationAnchors(object.list = object.list, anchor.features = features, 
    scale = FALSE, reduction = "rpca", normalization.method = normalization.method, 
    dims = dims, k.filter = k.filter, reference = reference, 
    verbose = verbose, ...) at

2. method(object = object[[assay]], assay = assay, orig = obj.orig, 
    layers = layers, scale.layer = scale.layer, features = features, 
    groups = groups, ...)

1. IntegrateLayers(object = seurat_phase, method = RPCAIntegration, 
    orig.reduction = "pca", new.reduction = "integrated.rpca", 
    verbose = TRUE)

Following up, I tried with CCAIntegration, and the error is:

Error in eval(expr, p) : 
  Not compatible with requested type: [type=S4; target=double].

My sessionInfo:

R version 4.2.3 (2023-03-15)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS/LAPACK: /home/bumpingbell/miniconda3/lib/

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 

mihem commented 1 year ago

Have you tried restarting R? Not the same error, but also related to BPCells matrix plus Integrate.

I also encountered a similar problem (it only occurred when I used scVI, not RPCA), and restarting helped.

Gesmira also said that this error should be fixed in the seurat5 branch, so maybe try updating:

remotes::install_github("satijalab/seurat", "seurat5", quiet = TRUE)

Gesmira commented 1 year ago

Hi @bumpingbell, Thank you for pointing out this bug! This has been fixed in the newest version of Seurat v5! Please clear and restart your R session and install using:

remotes::install_github("mojaveazure/seurat-object", "seurat5", quiet = TRUE)
remotes::install_github("satijalab/seurat", "seurat5", quiet = TRUE)

ZxZhou4150 commented 1 month ago

Hi Team,

I still hit the same CCA issue when running an intermediate step of "Celltrek" on data stored with BPCells. The problem seems to occur in Seurat::FindTransferAnchors. My code:

# link to count matrix
counts.mat <- BPCells::open_matrix_dir(dir = "/virtual_path_to_BPCell_storage")
adata_ref[["RNA"]]$counts <- counts.mat

DefaultAssay(adata_ref) <- "RNA"
adata_ref_conv = adata_ref
adata_ref_conv$orig.ident <- adata_ref_conv$v2.subclass.l1
adata_ref_conv[["ATAC"]] <- NULL
adata_ref_conv = NormalizeData(adata_ref_conv)

adata_vis = readRDS("/virtual_path_to_spatial_data")

traint <- my_traint(st_data=adata_vis, sc_data=adata_ref_conv, sc_assay='RNA', cell_names='v2.subclass.l1') #error occurs here

The error: Error: Not compatible with requested type: [type=S4; target=double].


8: stop(structure(list(message = "Not compatible with requested type: [type=S4; target=double].", 
       call = NULL, cppstack = NULL), class = c("Rcpp::not_compatible", 
   "C++Error", "error", "condition")))
7: Standardize(mat = object1, display_progress = FALSE)
6: RunCCA.default(object1 = data1, object2 = data2, standardize = TRUE, =, verbose = verbose, )
5: RunCCA(object1 = data1, object2 = data2, standardize = TRUE, =, verbose = verbose, )
4: RunCCA.Seurat(object1 = reference, object2 = query, features = features, = max(dims), renormalize = FALSE, rescale = FALSE, 
       verbose = verbose)
3: RunCCA(object1 = reference, object2 = query, features = features, = max(dims), renormalize = FALSE, rescale = FALSE, 
       verbose = verbose)
2: Seurat::FindTransferAnchors(reference = sc_data, query = st_data, 
       reference.assay = sc_assay, query.assay = st_assay, normalization.method = norm, 
       features = sc_st_features, reduction = "cca", ...) at celltrek_functions.R#30
1: my_traint(st_data = adata_vis, sc_data = adata_ref_conv, sc_assay = "RNA", 
       cell_names = "v2.subclass.l1")

I believe I'm using a newer version of Seurat, which should have fixed this problem. My session info:

R version 4.3.1 (2023-06-16)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 9.4 (Plow)

Matrix products: default
BLAS/LAPACK: /data/home/zz5708/miniconda3/envs/R/lib/;  LAPACK version 3.9.0

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              

time zone: America/New_York
tzcode source: system (glibc)

Could you please take a look to see what the problem might be? Thanks!

Also, the command remotes::install_github("mojaveazure/seurat-object", "seurat5", quiet = TRUE) no longer seems to work:

> remotes::install_github("mojaveazure/seurat-object", "seurat5", quiet = TRUE)
Error: Failed to install 'unknown package' from GitHub:
  HTTP error 404.
  No commit found for the ref seurat5

  Did you spell the repo owner and repo name correctly?
  - If spelling is correct, check that you have the required permissions to access the repo.
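
The 404 above says that no ref named seurat5 was found in that repository, so the install command from earlier in the thread cannot succeed as written. Since the session info shown here does not list package versions, a quick way to see what is actually installed is sketched below. The package list is an assumption based on the packages named in this thread, and the snippet only reports versions without installing anything.

```r
# Packages discussed in this thread (assumed list; adjust as needed).
pkgs <- c("Seurat", "SeuratObject", "BPCells")

# requireNamespace() loads a namespace without attaching it and returns
# FALSE instead of erroring when the package is not installed.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Collect the version string for each installed package, NA otherwise.
versions <- vapply(
  pkgs,
  function(p) {
    if (installed[[p]]) as.character(utils::packageVersion(p)) else NA_character_
  },
  character(1)
)
print(versions)
```

To my knowledge, Seurat v5 and SeuratObject have since been released on CRAN, so `install.packages(c("Seurat", "SeuratObject"))` may be a simpler route than the development branches. Including the printed versions in a follow-up would make it easier to tell whether the fix mentioned above is present.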