enblacar / SCpubr

Generate high quality, publication ready visualizations for single cell transcriptomics data.
https://enblacar.github.io/SCpubr-book/
GNU General Public License v3.0

Seurat v5 Support #70

Open shahrozeabbas opened 1 month ago

shahrozeabbas commented 1 month ago

Hi,

Love your package and hope to continue using it!

I believe SCpubr supports Seurat v5, but I am having some trouble with the do_DimPlot function. Am I missing something?

> object <- readRDS('objects/merged.rds')
> p <- object %>% do_DimPlot(reduction='X_umap', group.by='leiden')
Loading required package: BPCells
Error in `asMethod()`:
! Error converting IterableMatrix to dgCMatrix
• dgCMatrix objects cannot hold more than 2^31 non-zero entries
• Input matrix has 3239391600 entries
Run `rlang::last_trace()` to see where the error occurred.
> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Rocky Linux 8.7 (Green Obsidian)

Matrix products: default
BLAS/LAPACK: /usr/local/intel/2022.1.2.146/mkl/2022.0.2/lib/intel64/libmkl_rt.so.2;  LAPACK version 3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BPCells_0.2.0      dplyr_1.1.4        ggplot2_3.5.1      SCpubr_2.0.2      
[5] Seurat_5.1.0       SeuratObject_5.0.2 sp_2.1-4          

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3      jsonlite_1.8.8          magrittr_2.0.3         
  [4] spatstat.utils_3.0-5    zlibbioc_1.50.0         fs_1.6.4               
  [7] vctrs_0.6.5             ROCR_1.0-11             memoise_2.0.1          
 [10] spatstat.explore_3.2-7  htmltools_0.5.8.1       forcats_1.0.0          
 [13] gridGraphics_0.5-1      sctransform_0.4.1       parallelly_1.37.1      
 [16] KernSmooth_2.23-24      htmlwidgets_1.6.4       ica_1.0-3              
 [19] plyr_1.8.9              plotly_4.10.4           zoo_1.8-12             
 [22] cachem_1.1.0            igraph_2.0.3            mime_0.12              
 [25] lifecycle_1.0.4         pkgconfig_2.0.3         Matrix_1.7-0           
 [28] R6_2.5.1                fastmap_1.2.0           GenomeInfoDbData_1.2.12
 [31] MatrixGenerics_1.16.0   fitdistrplus_1.1-11     future_1.33.2          
 [34] shiny_1.8.1.1           digest_0.6.36           colorspace_2.1-0       
 [37] S4Vectors_0.42.0        patchwork_1.2.0         tensor_1.5             
 [40] RSpectra_0.16-1         irlba_2.3.5.1           GenomicRanges_1.56.1   
 [43] labeling_0.4.3          progressr_0.14.0        fansi_1.0.6            
 [46] spatstat.sparse_3.1-0   httr_1.4.7              polyclip_1.10-6        
 [49] abind_1.4-5             compiler_4.4.1          withr_3.0.0            
 [52] viridis_0.6.5           fastDummies_1.7.3       MASS_7.3-61            
 [55] tools_4.4.1             lmtest_0.9-40           httpuv_1.6.15          
 [58] future.apply_1.11.2     goftest_1.2-3           glue_1.7.0             
 [61] nlme_3.1-165            promises_1.3.0          grid_4.4.1             
 [64] Rtsne_0.17              cluster_2.1.6           reshape2_1.4.4         
 [67] generics_0.1.3          gtable_0.3.5            spatstat.data_3.1-2    
 [70] tidyr_1.3.1             data.table_1.15.4       XVector_0.44.0         
 [73] utf8_1.2.4              BiocGenerics_0.50.0     spatstat.geom_3.2-9    
 [76] RcppAnnoy_0.0.22        ggrepel_0.9.5           RANN_2.6.1             
 [79] pillar_1.9.0            stringr_1.5.1           yulab.utils_0.1.4      
 [82] spam_2.10-0             RcppHNSW_0.6.0          later_1.3.2            
 [85] splines_4.4.1           lattice_0.22-6          survival_3.7-0         
 [88] deldir_2.0-4            tidyselect_1.2.1        miniUI_0.1.1.1         
 [91] pbapply_1.7-2           gridExtra_2.3           IRanges_2.38.0         
 [94] scattermore_1.2         stats4_4.4.1            matrixStats_1.3.0      
 [97] UCSC.utils_1.0.0        stringi_1.8.4           lazyeval_0.2.2         
[100] codetools_0.2-20        tibble_3.2.1            ggplotify_0.1.2        
[103] cli_3.6.3               uwot_0.2.2              xtable_1.8-4           
[106] reticulate_1.38.0       munsell_0.5.1           GenomeInfoDb_1.40.1    
[109] Rcpp_1.0.12             globals_0.16.3          spatstat.random_3.2-3  
[112] png_0.1-8               parallel_4.4.1          assertthat_0.2.1       
[115] dotCall64_1.1-1         listenv_0.9.1           viridisLite_0.4.2      
[118] scales_1.3.0            ggridges_0.5.6          leiden_0.4.3.1         
[121] purrr_1.0.2             rlang_1.1.4             cowplot_1.1.3          

enblacar commented 1 month ago

Hi @shahrozeabbas,

Thank you for using my package!

I am sorry you are facing this issue. I did incorporate some Seurat v5 support back when it was released as a beta. However, I think it was mostly focused on handling Assay5 vs. Assay objects and how the different data slots are retrieved, so chances are that more support is still missing!

In this case, it seems it's due to working with a very large dataset, for which, as far as I remember, I never implemented a fix. Is this correct? As a start, does computing it on a subset of the data solve the issue?
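A minimal sketch of the subset test suggested above. `sample_cells()` is a hypothetical helper (it is not part of SCpubr or Seurat), and the default of 50000 cells is an arbitrary choice; whether a subset of that size stays under the dgCMatrix limit depends on the number of features in the object.

```r
# Hypothetical helper (not part of SCpubr or Seurat): draw a reproducible
# random sample of cell names, capped at the number of cells available.
sample_cells <- function(cell_names, n = 50000, seed = 42) {
  set.seed(seed)  # note: this resets the global RNG state
  sample(cell_names, size = min(n, length(cell_names)))
}

# Usage with the object from this thread (not run here):
# cells <- sample_cells(colnames(object))
# small <- subset(object, cells = cells)
# p <- small %>% do_DimPlot(reduction = 'X_umap', group.by = 'leiden')
```

If the plot succeeds on the subset, that would point to the size of the matrix rather than the Seurat v5 object structure.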

This is the first time I have come across this issue. Could you also provide the output of rlang::last_trace()?

Thanks for your help!

Enrique

shahrozeabbas commented 1 month ago

I believe the issue stems from how count matrices are handled. Seurat v5 changes the way data slots are retrieved, but it also adds support for the IterableMatrix class from BPCells, which is what makes it possible to load larger datasets into R and Seurat. For plotting, it seems that functions within SCpubr try to cast the IterableMatrix to a dgCMatrix, which cannot hold more than 2^31 non-zero entries. I think the Seurat plotting functions are able to handle this, which makes sense, but if somewhere within SCpubr you're using asMethod() to convert the IterableMatrix, it will cause problems for larger datasets that normally wouldn't fit into R.
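To illustrate the point, here is a rough sketch of the size check and of how the data for a grouped dimensional-reduction plot can be assembled from the embeddings and cell metadata alone, without converting any expression matrix. `dimplot_data()` is a hypothetical helper and not how SCpubr is implemented; the numbers come from the error message above.

```r
# Figures taken from the error message in this thread.
n_entries <- 3239391600
dgc_limit <- 2^31            # limit quoted in the error text
n_entries > dgc_limit        # TRUE: the conversion cannot succeed

# Hypothetical helper (not part of SCpubr): combine a cells x 2 embedding
# matrix with one metadata column. No expression matrix is touched.
dimplot_data <- function(embeddings, metadata, group.by) {
  stopifnot(group.by %in% colnames(metadata))
  # Align metadata rows to the embedding rows by cell name.
  idx <- match(rownames(embeddings), rownames(metadata))
  stopifnot(!anyNA(idx))
  data.frame(
    x = embeddings[, 1],
    y = embeddings[, 2],
    group = metadata[[group.by]][idx],
    row.names = rownames(embeddings)
  )
}

# Usage with the object from this thread (not run here):
# emb <- Seurat::Embeddings(object, reduction = "X_umap")
# df  <- dimplot_data(emb, object@meta.data, "leiden")
# ggplot2::ggplot(df, ggplot2::aes(x, y, colour = group)) +
#   ggplot2::geom_point(size = 0.1)
```

Grouping by a metadata column such as leiden only needs one value per cell, so in principle the counts never have to be materialised for this kind of plot.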

I will have to reproduce the error later today and update you with the output, but thanks for the quick reply!

enblacar commented 3 weeks ago

Hi @shahrozeabbas,

Thanks for waiting! I am currently under a heavy workload with my PhD. I will come back to this issue as soon as possible!

Enrique