satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

RPCA or CCA integration with IntegrateLayers fails when using on-disk BPCells matrices #7434

Closed bumpingbell closed 1 year ago

bumpingbell commented 1 year ago

I've been following the "Integrative analysis in Seurat v5" vignette for dataset integration, but because my dataset is large I store the count matrix on disk with BPCells. When I run RPCA integration through IntegrateLayers:

library(Seurat)
options(Seurat.object.assay.version = 'v5')
library(SeuratData)
library(BPCells)

obj <- LoadData("pbmcsca")
obj <- subset(obj, nFeature_RNA > 1000)

write_matrix_dir(mat = obj[["RNA"]]$counts, dir = "~/dir")
counts.mat <- open_matrix_dir(dir = "~/dir")
obj[["RNA"]]$counts <- counts.mat

obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)

obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)

obj <- IntegrateLayers(
  object = obj, method = RPCAIntegration,
  orig.reduction = "pca", new.reduction = "integrated.rpca", 
  verbose = TRUE
)

I got this error: Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 'as.matrix': requires numeric/complex matrix/vector arguments

The error seems to occur in the FindIntegrationAnchors step, where a function calls crossprod with an input that is not a numeric/complex matrix or vector. I traced the call, and I believe the failing line is in Seurat:::ProjectSVD: projected.u <- as.matrix(x = crossprod(x = vt, y = data)). The data object there is an S4 object of class "RenameDims".
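A possible interim workaround, not confirmed anywhere in this thread, is to materialize the on-disk matrix in memory before the failing step, which gives up the memory savings of BPCells. The following is a minimal sketch that assumes the data fit in RAM. `to_in_memory` is a hypothetical helper name, and the coercion relies on BPCells' `as(mat, "dgCMatrix")` conversion, which needs BPCells and Matrix to be loaded when an on-disk matrix is actually passed in:

```r
library(methods)

# Hypothetical helper (not part of Seurat or BPCells): materialize a BPCells
# on-disk matrix in memory and return any other matrix type unchanged.
to_in_memory <- function(mat) {
  if (inherits(mat, "IterableMatrix")) {
    # BPCells matrix classes (e.g. "RenameDims") extend "IterableMatrix".
    # Assumption: BPCells and Matrix are loaded so this coercion is defined.
    return(as(mat, "dgCMatrix"))
  }
  mat
}

# An ordinary in-memory matrix passes through untouched.
m <- matrix(1:6, nrow = 2)
print(identical(to_in_memory(m), m))
```

In the reproduction above this would be applied as obj[["RNA"]]$counts <- to_in_memory(obj[["RNA"]]$counts) before NormalizeData, so that the downstream layers are ordinary matrices. Whether that is affordable depends on the size of the dataset.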

Here is the traceback:

14. h(simpleError(msg, call))

13. .handleSimpleError(function (cond) 
.Internal(C_tryCatchHelper(addr, 1L, cond)), "requires numeric/complex matrix/vector arguments", 
    base::quote(crossprod(x, y)))

12. base::crossprod(x, y)

11. crossprod(x = vt, y = data)

10. crossprod(x = vt, y = data)

9. as.matrix(x = crossprod(x = vt, y = data))

8. ProjectSVD(reduction = object.2[[reduction]], data = data.1, 
    mode = reduction, features = common.features, do.scale = do.scale, 
    do.center = do.center, use.original.stats = FALSE, verbose = verbose)

7. ReciprocalProject(object.1 = object.1, object.2 = object.2, reduction = "pca", 
    projected.name = "projectedpca", features = anchor.features, 
    do.scale = FALSE, do.center = FALSE, slot = "scale.data", 
    l2.norm = l2.norm, verbose = verbose)

6. FUN(X[[i]], ...)

5. lapply(X[Split[[i]]], FUN, ...)

4. pblapply(X = 1:nrow(x = combinations), FUN = anchoring.fxn)

3. FindIntegrationAnchors(object.list = object.list, anchor.features = features, 
    scale = FALSE, reduction = "rpca", normalization.method = normalization.method, 
    dims = dims, k.filter = k.filter, reference = reference, 
    verbose = verbose, ...) at <tmp>#42

2. method(object = object[[assay]], assay = assay, orig = obj.orig, 
    layers = layers, scale.layer = scale.layer, features = features, 
    groups = groups, ...)

1. IntegrateLayers(object = seurat_phase, method = RPCAIntegration, 
    orig.reduction = "pca", new.reduction = "integrated.rpca", 
    verbose = TRUE)

Following up, I tried with CCAIntegration, and the error is:

Error in eval(expr, p) : 
  Not compatible with requested type: [type=S4; target=double].

My sessionInfo:

R version 4.2.3 (2023-03-15)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS/LAPACK: /home/bumpingbell/miniconda3/lib/libopenblasp-r0.3.21.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] pbmcsca.SeuratData_3.0.0 lungref.SeuratData_2.0.0 SeuratData_0.2.2.9001    Seurat_4.9.9.9045       
[5] SeuratObject_4.9.9.9084  sp_1.6-1                 shinyBS_0.61.1           BPCells_0.1.0           

loaded via a namespace (and not attached):
  [1] rappdirs_0.3.3                    rtracklayer_1.58.0                scattermore_1.1                  
  [4] R.methodsS3_1.8.2                 tidyr_1.3.0                       knitr_1.43                       
  [7] JASPAR2020_0.99.10                ggplot2_3.4.2                     bit64_4.0.5                      
 [10] irlba_2.3.5.1                     DelayedArray_0.24.0               R.utils_2.12.2                   
 [13] data.table_1.14.8                 KEGGREST_1.38.0                   TFBSTools_1.36.0                 
 [16] RCurl_1.98-1.12                   AnnotationFilter_1.22.0           generics_0.1.3                   
 [19] BiocGenerics_0.44.0               GenomicFeatures_1.50.4            cowplot_1.1.1                    
 [22] RSQLite_2.3.1                     RANN_2.6.1                        future_1.32.0                    
 [25] bit_4.0.5                         tzdb_0.4.0                        spatstat.data_3.0-1              
 [28] xml2_1.3.4                        httpuv_1.6.11                     SummarizedExperiment_1.28.0      
 [31] DirichletMultinomial_1.40.0       gargle_1.4.0                      xfun_0.39                        
 [34] hms_1.1.3                         evaluate_0.21                     promises_1.2.0.1                 
 [37] fansi_1.0.4                       restfulr_0.0.15                   progress_1.2.2                   
 [40] caTools_1.18.2                    dbplyr_2.3.2                      igraph_1.4.3                     
 [43] DBI_1.1.3                         htmlwidgets_1.6.2                 spatstat.geom_3.2-1              
 [46] googledrive_2.1.0                 stats4_4.2.3                      purrr_1.0.1                      
 [49] ellipsis_0.3.2                    RSpectra_0.16-1                   dplyr_1.1.2                      
 [52] annotate_1.76.0                   biomaRt_2.54.1                    deldir_1.0-9                     
 [55] MatrixGenerics_1.10.0             vctrs_0.6.2                       Biobase_2.58.0                   
 [58] SeuratDisk_0.0.0.9020             ensembldb_2.22.0                  ROCR_1.0-11                      
 [61] abind_1.4-5                       cachem_1.0.8                      withr_2.5.0                      
 [64] BSgenome.Hsapiens.UCSC.hg38_1.4.5 BSgenome_1.66.3                   progressr_0.13.0                 
 [67] presto_1.0.0                      sctransform_0.3.5                 GenomicAlignments_1.34.1         
 [70] prettyunits_1.1.1                 goftest_1.2-3                     cluster_2.1.4                    
 [73] dotCall64_1.0-2                   lazyeval_0.2.2                    seqLogo_1.64.0                   
 [76] crayon_1.5.2                      hdf5r_1.3.8                       spatstat.explore_3.2-1           
 [79] pkgconfig_2.0.3                   GenomeInfoDb_1.34.9               nlme_3.1-162                     
 [82] ProtGenerics_1.30.0               rlang_1.1.1                       globals_0.16.2                   
 [85] lifecycle_1.0.3                   miniUI_0.1.1.1                    filelock_1.0.2                   
 [88] fastDummies_1.6.3                 BiocFileCache_2.6.1               cellranger_1.1.0                 
 [91] polyclip_1.10-4                   RcppHNSW_0.4.1                    matrixStats_1.0.0                
 [94] lmtest_0.9-40                     Matrix_1.5-4.1                    Rhdf5lib_1.20.0                  
 [97] zoo_1.8-12                        ggridges_0.5.4                    googlesheets4_1.1.0              
[100] png_0.1-8                         viridisLite_0.4.2                 rjson_0.2.21                     
[103] shinydashboard_0.7.2              bitops_1.0-7                      R.oo_1.25.0                      
[106] KernSmooth_2.23-21                spam_2.9-1                        rhdf5filters_1.10.1              
[109] Biostrings_2.66.0                 blob_1.2.4                        stringr_1.5.0                    
[112] parallelly_1.36.0                 spatstat.random_3.1-5             readr_2.1.4                      
[115] S4Vectors_0.36.2                  CNEr_1.34.0                       scales_1.2.1                     
[118] memoise_2.0.1                     magrittr_2.0.3                    plyr_1.8.8                       
[121] ica_1.0-3                         zlibbioc_1.44.0                   compiler_4.2.3                   
[124] BiocIO_1.8.0                      RColorBrewer_1.1-3                fitdistrplus_1.1-11              
[127] Rsamtools_2.14.0                  cli_3.6.1                         XVector_0.38.0                   
[130] listenv_0.9.0                     patchwork_1.1.2                   pbapply_1.7-0                    
[133] MASS_7.3-60                       tidyselect_1.2.0                  stringi_1.7.12                   
[136] yaml_2.3.7                        ggrepel_0.9.3                     grid_4.2.3                       
[139] fastmatch_1.1-3                   EnsDb.Hsapiens.v86_2.99.0         tools_4.2.3                      
[142] future.apply_1.11.0               parallel_4.2.3                    rstudioapi_0.14                  
[145] TFMPvalue_0.0.9                   gridExtra_2.3                     Rtsne_0.16                       
[148] digest_0.6.31                     shiny_1.7.4                       pracma_2.4.2                     
[151] Rcpp_1.0.10                       Azimuth_0.4.6.9004                GenomicRanges_1.50.2             
[154] later_1.3.1                       RcppAnnoy_0.0.20                  httr_1.4.6                       
[157] AnnotationDbi_1.60.2              colorspace_2.1-0                  XML_3.99-0.14                    
[160] fs_1.6.2                          tensor_1.5                        reticulate_1.28                  
[163] IRanges_2.32.0                    splines_4.2.3                     uwot_0.1.14                      
[166] RcppRoll_0.3.0                    spatstat.utils_3.0-3              plotly_4.10.1                    
[169] xtable_1.8-4                      jsonlite_1.8.4                    poweRlaw_0.70.6                  
[172] R6_2.5.1                          pillar_1.9.0                      htmltools_0.5.5                  
[175] mime_0.12                         glue_1.6.2                        fastmap_1.1.1                    
[178] DT_0.28                           BiocParallel_1.32.6               codetools_0.2-19                 
[181] Signac_1.9.0.9000                 utf8_1.2.3                        lattice_0.21-8                   
[184] spatstat.sparse_3.0-1             tibble_3.2.1                      curl_5.0.0                       
[187] leiden_0.4.3                      gtools_3.9.4                      shinyjs_2.1.0                    
[190] GO.db_3.16.0                      survival_3.5-5                    rmarkdown_2.22                   
[193] munsell_0.5.0                     rhdf5_2.42.1                      GenomeInfoDbData_1.2.9           
[196] reshape2_1.4.4                    gtable_0.3.3

mihem commented 1 year ago

Have you tried restarting R? This issue reports a different error, but it also involves a BPCells matrix combined with integration: https://github.com/satijalab/seurat/issues/7329

I ran into a similar problem too (it only occurred when I used scvi, NOT rpca), and restarting helped there as well: https://github.com/satijalab/seurat/issues/7373

Gesmira also said that this error should be fixed in the seurat5 branch, so maybe try updating:

remotes::install_github("satijalab/seurat", "seurat5", quiet = TRUE)

Gesmira commented 1 year ago

Hi @bumpingbell, Thank you for pointing out this bug! This has been fixed in the newest version of Seurat v5! Please clear and restart your R session and install using:

remotes::install_github("mojaveazure/seurat-object", "seurat5", quiet = TRUE)
remotes::install_github("satijalab/seurat", "seurat5", quiet = TRUE)

ZxZhou4150 commented 2 months ago

Hi Team,

I still hit the same CCA error when running an intermediate step of CellTrek with data stored via BPCells. The problem seems to occur in Seurat::FindTransferAnchors. My code:

# link to count matrix
counts.mat <- BPCells::open_matrix_dir(dir = "/virtual_path_to_BPCell_storage")
adata_ref[["RNA"]]$counts <- counts.mat

DefaultAssay(adata_ref) <- "RNA"
adata_ref_conv = adata_ref
adata_ref_conv$orig.ident <- adata_ref_conv$v2.subclass.l1
adata_ref_conv[["ATAC"]] <- NULL
adata_ref_conv = NormalizeData(adata_ref_conv)

adata_vis = readRDS("/virtual_path_to_spatial_data")

traint <- my_traint(st_data=adata_vis, sc_data=adata_ref_conv, sc_assay='RNA', cell_names='v2.subclass.l1') #error occurs here

The error: Error: Not compatible with requested type: [type=S4; target=double].

Traceback:

8: stop(structure(list(message = "Not compatible with requested type: [type=S4; target=double].", 
       call = NULL, cppstack = NULL), class = c("Rcpp::not_compatible", 
   "C++Error", "error", "condition")))
7: Standardize(mat = object1, display_progress = FALSE)
6: RunCCA.default(object1 = data1, object2 = data2, standardize = TRUE, 
       num.cc = num.cc, verbose = verbose, )
5: RunCCA(object1 = data1, object2 = data2, standardize = TRUE, 
       num.cc = num.cc, verbose = verbose, )
4: RunCCA.Seurat(object1 = reference, object2 = query, features = features, 
       num.cc = max(dims), renormalize = FALSE, rescale = FALSE, 
       verbose = verbose)
3: RunCCA(object1 = reference, object2 = query, features = features, 
       num.cc = max(dims), renormalize = FALSE, rescale = FALSE, 
       verbose = verbose)
2: Seurat::FindTransferAnchors(reference = sc_data, query = st_data, 
       reference.assay = sc_assay, query.assay = st_assay, normalization.method = norm, 
       features = sc_st_features, reduction = "cca", ...) at celltrek_functions.R#30
1: my_traint(st_data = adata_vis, sc_data = adata_ref_conv, sc_assay = "RNA", 
       cell_names = "v2.subclass.l1")

I believe I'm using a newer version of Seurat, which should already include the fix for this problem. My session info:

R version 4.3.1 (2023-06-16)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 9.4 (Plow)

Matrix products: default
BLAS/LAPACK: /data/home/zz5708/miniconda3/envs/R/lib/libopenblasp-r0.3.21.so;  LAPACK version 3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BPCells_0.2.0               magrittr_2.0.3              RANN_2.6.1                 
 [4] SeuratDisk_0.0.0.9021       reticulate_1.38.0           ConsensusClusterPlus_1.66.0
 [7] viridis_0.6.5               viridisLite_0.4.2           Seurat_5.1.0               
[10] SeuratObject_5.0.99.9001    sp_2.1-4                    dplyr_1.1.4                
[13] CellTrek_0.0.94            

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3      rstudioapi_0.16.0       jsonlite_1.8.8         
  [4] spatstat.utils_3.0-5    zlibbioc_1.48.2         vctrs_0.6.5            
  [7] ROCR_1.0-11             spatstat.explore_3.3-1  RCurl_1.98-1.14        
 [10] rstatix_0.7.2           htmltools_0.5.8.1       dynamicTreeCut_1.63-1  
 [13] broom_1.0.6             sctransform_0.4.1       parallelly_1.37.1      
 [16] KernSmooth_2.23-24      htmlwidgets_1.6.4       ica_1.0-3              
 [19] plyr_1.8.9              plotly_4.10.4           zoo_1.8-12             
 [22] igraph_2.0.3            mime_0.12               lifecycle_1.0.4        
 [25] pkgconfig_2.0.3         Matrix_1.6-4            R6_2.5.1               
 [28] fastmap_1.2.0           magic_1.6-1             GenomeInfoDbData_1.2.11
 [31] MatrixGenerics_1.14.0   fitdistrplus_1.2-1      future_1.33.2          
 [34] shiny_1.8.1.1           digest_0.6.36           colorspace_2.1-0       
 [37] patchwork_1.2.0         S4Vectors_0.40.2        tensor_1.5             
 [40] RSpectra_0.16-1         irlba_2.3.5.1           GenomicRanges_1.54.1   
 [43] akima_0.6-3.4           ggpubr_0.6.0            philentropy_0.8.0      
 [46] progressr_0.14.0        fansi_1.0.5             spatstat.sparse_3.1-0  
 [49] httr_1.4.7              polyclip_1.10-6         abind_1.4-5            
 [52] compiler_4.3.1          withr_3.0.0             bit64_4.0.5            
 [55] backports_1.5.0         carData_3.0-5           fastDummies_1.7.3      
 [58] ggsignif_0.6.4          MASS_7.3-58             tools_4.3.1            
 [61] lmtest_0.9-40           httpuv_1.6.15           future.apply_1.11.2    
 [64] goftest_1.2-3           glue_1.7.0              dbscan_1.2-0           
 [67] DiagrammeR_1.0.11       nlme_3.1-163            promises_1.3.0         
 [70] grid_4.3.1              Rtsne_0.17              cluster_2.1.6          
 [73] reshape2_1.4.4          generics_0.1.3          hdf5r_1.3.11           
 [76] gtable_0.3.5            spatstat.data_3.1-2     tidyr_1.3.1            
 [79] data.table_1.15.4       car_3.1-2               utf8_1.2.4             
 [82] XVector_0.42.0          BiocGenerics_0.48.1     spatstat.geom_3.3-2    
 [85] RcppAnnoy_0.0.22        ggrepel_0.9.5           pillar_1.9.0           
 [88] stringr_1.5.1           spam_2.10-0             RcppHNSW_0.6.0         
 [91] later_1.3.2             splines_4.3.1           lattice_0.22-5         
 [94] bit_4.0.5               survival_3.7-0          deldir_1.0-9           
 [97] tidyselect_1.2.1        miniUI_0.1.1.1          pbapply_1.7-2          
[100] gridExtra_2.3           IRanges_2.36.0          scattermore_1.2        
[103] stats4_4.3.1            Biobase_2.62.0          matrixStats_1.3.0      
[106] visNetwork_2.1.2        stringi_1.8.4           lazyeval_0.2.2         
[109] codetools_0.2-20        data.tree_1.1.0         tibble_3.2.1           
[112] packcircles_0.3.6       cli_3.6.3               uwot_0.2.2             
[115] geometry_0.4.7          xtable_1.8-4            randomForestSRC_3.3.1  
[118] munsell_0.5.1           Rcpp_1.0.12             GenomeInfoDb_1.38.8    
[121] globals_0.16.3          spatstat.random_3.3-1   png_0.1-8              
[124] fastcluster_1.2.6       spatstat.univar_3.0-0   parallel_4.3.1         
[127] ggplot2_3.5.1           dotCall64_1.1-1         bitops_1.0-7           
[130] listenv_0.9.1           scales_1.3.0            ggridges_0.5.6         
[133] crayon_1.5.3            leiden_0.4.3.1          purrr_1.0.2            
[136] rlang_1.1.4             cowplot_1.1.3

Could you please take a look to see what the problem might be? Thanks!

Also, the command remotes::install_github("mojaveazure/seurat-object", "seurat5", quiet = TRUE) no longer seems to work:

> remotes::install_github("mojaveazure/seurat-object", "seurat5", quiet = TRUE)
Error: Failed to install 'unknown package' from GitHub:
  HTTP error 404.
  No commit found for the ref seurat5

  Did you spell the repo owner and repo name correctly?
  - If spelling is correct, check that you have the required permissions to access the repo.
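
Since the install command above fails, it can help to confirm which versions are actually installed before attempting any reinstall. The following is a small sketch using only base R. The package list is taken from the session info in this thread, and `versions` is just an illustrative variable name:

```r
# Report the installed version of each package involved in this issue,
# or NA if the package is not installed in the current library paths.
pkgs <- c("Seurat", "SeuratObject", "BPCells")

versions <- vapply(pkgs, function(p) {
  if (requireNamespace(p, quietly = TRUE)) {
    as.character(utils::packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))

print(versions)
```

The printed values can be compared against the "other attached packages" section of sessionInfo() to check that the running session uses the same library that was updated.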