satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

SaveSeuratRds: Cannot find any of the layer files specified #8313

Open NikicaJEa opened 9 months ago

NikicaJEa commented 9 months ago

Hi. I am experiencing multiple issues with SaveSeuratRds. Saving looks fine: the function does not raise an error. But when I load the saved object with LoadSeuratRds, I get: Warning message: Cannot find any of the layer files specified

Here is the code example:

library(Seurat)
library(BPCells)
counts_mat <- open_matrix_dir(dir = "./data/cartridge1_1/")

cells <- CreateSeuratObject(counts_mat) 
cells
#An object of class Seurat 
#20627 features across 40871 samples within 1 assay 
#Active assay: RNA (20627 features, 0 variable features)
#1 layer present: counts

SaveSeuratRds(
  object = cells,
  file = "./output_folder/obj.Rds",
  move = TRUE,
  relative = TRUE
)
#Warning message:
#Trying to move ‘./data/cartridge1_1’ to itself, skipping 

cells_loaded <- LoadSeuratRds("./output_folder/obj.Rds")
#Warning message:
#Cannot find any of the layer files specified 

cells_loaded
#An object of class Seurat 
#20627 features across 40871 samples within 1 assay 
#Active assay: RNA (20627 features, 0 variable features)
# 0 layers present: 

Another issue related to SaveSeuratRds: if I call SaveSeuratRds with move = TRUE, relative = FALSE, it partly works. The object loaded with LoadSeuratRds has both the counts and data layers present, BUT the data layer is identical to the counts layer. When I inspect the counts and data layers, they have the same Queued Operations (although they shouldn't). It seems the layers' Queued Operations were not transferred correctly during SaveSeuratRds.

Would be great if someone could have a look into this. Many thanks!

sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3;  LAPACK version 3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=de_AT.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=de_AT.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=de_AT.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Vienna
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BPCells_0.1.0      Seurat_5.0.1       SeuratObject_5.0.1 sp_1.6-1          

loaded via a namespace (and not attached):
  [1] bitops_1.0-7            deldir_1.0-9            pbapply_1.7-0           gridExtra_2.3           rlang_1.1.1            
  [6] magrittr_2.0.3          RcppAnnoy_0.0.20        matrixStats_0.63.0      ggridges_0.5.4          compiler_4.3.2         
 [11] spatstat.geom_3.2-1     png_0.1-8               vctrs_0.6.2             reshape2_1.4.4          stringr_1.5.0          
 [16] pkgconfig_2.0.3         fastmap_1.1.1           XVector_0.40.0          ellipsis_0.3.2          utf8_1.2.3             
 [21] promises_1.2.0.1        purrr_1.0.1             zlibbioc_1.46.0         GenomeInfoDb_1.36.0     jsonlite_1.8.7         
 [26] goftest_1.2-3           later_1.3.1             spatstat.utils_3.0-3    irlba_2.3.5.1           parallel_4.3.2         
 [31] cluster_2.1.4           R6_2.5.1                ica_1.0-3               stringi_1.7.12          RColorBrewer_1.1-3     
 [36] spatstat.data_3.0-1     reticulate_1.15         parallelly_1.36.0       GenomicRanges_1.52.0    lmtest_0.9-40          
 [41] scattermore_1.2         Rcpp_1.0.10             tensor_1.5              future.apply_1.11.0     zoo_1.8-12             
 [46] IRanges_2.34.0          sctransform_0.4.1       httpuv_1.6.11           Matrix_1.6-4            splines_4.3.2          
 [51] igraph_1.4.3            tidyselect_1.2.0        rstudioapi_0.14         abind_1.4-5             spatstat.random_3.1-5  
 [56] codetools_0.2-19        miniUI_0.1.1.1          spatstat.explore_3.2-1  listenv_0.9.0           lattice_0.21-8         
 [61] tibble_3.2.1            plyr_1.8.8              shiny_1.7.4             ROCR_1.0-11             Rtsne_0.16             
 [66] future_1.32.0           fastDummies_1.6.3       survival_3.5-5          polyclip_1.10-4         fitdistrplus_1.1-11    
 [71] pillar_1.9.0            KernSmooth_2.23-21      stats4_4.3.2            plotly_4.10.1           generics_0.1.3         
 [76] RCurl_1.98-1.12         RcppHNSW_0.4.1          S4Vectors_0.38.1        ggplot2_3.4.2           munsell_0.5.0          
 [81] scales_1.2.1            globals_0.16.2          xtable_1.8-4            glue_1.6.2              lazyeval_0.2.2         
 [86] tools_4.3.2             data.table_1.14.8       RSpectra_0.16-1         RANN_2.6.1              fs_1.6.3               
 [91] leiden_0.4.3            dotCall64_1.0-2         cowplot_1.1.1           grid_4.3.2              tidyr_1.3.0            
 [96] colorspace_2.1-0        GenomeInfoDbData_1.2.10 nlme_3.1-162            patchwork_1.1.2         cli_3.6.1              
[101] spatstat.sparse_3.0-1   spam_2.9-1              fansi_1.0.4             viridisLite_0.4.2       dplyr_1.1.2            
[106] uwot_0.1.14             gtable_0.3.3            digest_0.6.31           BiocGenerics_0.46.0     progressr_0.13.0       
[111] ggrepel_0.9.3           htmlwidgets_1.6.2       htmltools_0.5.5         lifecycle_1.0.3         httr_1.4.6             
[116] mime_0.12               MASS_7.3-60

KatarinaLalatovic commented 9 months ago

I am curious about this as well

Gesmira commented 9 months ago

Hi, thanks for pointing this out! We are actively looking into a fix. As you mentioned, I would avoid using the relative parameter for now.

As a quick fix, you can get around this by writing the normalized layer to disk again. It seems that, as you mentioned, the issue specifically happens when the data layer is a transformation of the current BPCells on-disk matrix. If you write the normalized data layer to its own on-disk directory, it will be saved correctly.

This example works normally for me:

library(Seurat)
library(BPCells)
library(SeuratData)

pbmc3k[["RNA"]]$counts <- write_matrix_dir(pbmc3k[["RNA"]]$counts, dir = "/brahms/mollag/practice/pbmc_counts_new_issue/")
pbmc3k <- NormalizeData(pbmc3k)
pbmc3k[["RNA"]]$data <- write_matrix_dir(pbmc3k[["RNA"]]$data, dir = "/brahms/mollag/practice/pbmc_data_new_issue/")

identical(as(pbmc3k[["RNA"]]$counts, "dgCMatrix"), as(pbmc3k[["RNA"]]$data, "dgCMatrix"))
# should return FALSE and does

SaveSeuratRds(
    object = pbmc3k,
    file = "my/path/obj.Rds")

cells_loaded <- LoadSeuratRds("my/path/obj.Rds")
identical(as(cells_loaded[["RNA"]]$counts, "dgCMatrix"), as(cells_loaded[["RNA"]]$data, "dgCMatrix"))
# also returns FALSE as expected

NikicaJEa commented 9 months ago

That sounds great. Thanks for looking into it!

NikicaJEa commented 9 months ago

Any progress with this bug? At least fixing the relative parameter would help: relative = TRUE doesn't work, and it is very important for people who want to share Seurat objects with someone else.

Gesmira commented 9 months ago

Hi @NikicaJEa, apologies that we have not yet had a chance to look into this deeply as we prepare for the Seurat 5.0.2 release. In the meantime, does the workaround I shared let you share the objects with other people?

NikicaJEa commented 8 months ago

Hi. The example you kindly shared works nicely for saving the count and normalized count on-disk matrices, but it does not help if I want to share a Seurat object with someone externally. In your example, if I run cells_loaded <- LoadSeuratRds("my/path/obj.Rds"), then to access the on-disk matrix I need access to dir = "/brahms/mollag/practice/pbmc_counts_new_issue/". To my understanding, anyone who doesn't have access to this exact folder path won't be able to load the Seurat object, i.e. the on-disk matrices. I guess this is where the relative parameter would really come in handy. Would you agree?
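
To illustrate the path problem in isolation, here is a minimal base R sketch. It does not use Seurat or BPCells; the folder names (project_a, project_b, counts) are made up for the illustration, and an empty directory stands in for an on-disk matrix. It only shows that an absolute reference stops resolving once the folder is moved, while a relative reference still resolves against the new location.

```r
# Stand-in "project" folder containing a directory that plays the role
# of an on-disk matrix (hypothetical names, for illustration only).
project <- file.path(tempdir(), "project_a")
shared  <- file.path(tempdir(), "project_b")
unlink(c(project, shared), recursive = TRUE)  # start from a clean state
dir.create(file.path(project, "counts"), recursive = TRUE)

# Two ways an object could remember where its matrix lives.
abs_ref <- normalizePath(file.path(project, "counts"))  # absolute path
rel_ref <- "counts"                                      # relative to the project folder

# Simulate sharing: the whole folder ends up at a different location.
moved <- file.rename(project, shared)

# The absolute reference still points at the old location.
abs_ok <- dir.exists(abs_ref)
# The relative reference resolves once it is joined to the new location.
rel_ok <- dir.exists(file.path(shared, rel_ref))

print(c(absolute = abs_ok, relative = rel_ok))
```

In a real session the relative reference would be resolved against the working directory, so the recipient would need to start R in (or setwd() to) the folder that holds both the Rds file and the matrix directory.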

NikicaJEa commented 8 months ago

any update?

NikicaJEa commented 7 months ago

Guys, please just make the SaveSeuratRds() function work with relative = TRUE. Without this feature the function is not nearly as useful as it could be. Thanks.

babiddy commented 7 months ago

The workaround I have used for this scenario is to modify the directory path of the BPCells matrix itself before creating the Seurat object.

library(Seurat)
library(BPCells)
counts_mat <- open_matrix_dir(dir = "./data/cartridge1_1/")

counts_mat@dir <- "relative/path/here"

cells <- CreateSeuratObject(counts_mat)