satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Reclustering of spatial data in Seurat V5 not working #9378

Open JoyOtten opened 5 days ago

JoyOtten commented 5 days ago

Hi,

I've been trying to reproduce my previous scripts and data pipeline from earlier versions of Seurat. Specifically, I'm trying to recluster the clusters I found earlier. However, I'm running into the following error:

cluster <- FindNeighbors(region, reduction = "pca", dims = 1:30)
Computing nearest neighbor graph
Computing SNN
Error in validObject(object = x) :
  invalid class “Seurat” object: 1: All cells in images must be present in the Seurat object
invalid class “Seurat” object: 2: All cells in images must be present in the Seurat object
invalid class “Seurat” object: 3: All cells in images must be present in the Seurat object
invalid class “Seurat” object: 4: All cells in images must be present in the Seurat object
invalid class “Seurat” object: 5: All cells in images must be present in the Seurat object
invalid class “Seurat” object: 6: All cells in images must be present in the Seurat object
invalid class “Seurat” object: 7: All cells in images must be present in the Seurat object
invalid class “Seurat” object: 8: All cells in images must be present in the Seurat object
invalid class “Seurat” object: 9: All cells in images must be present in the Seurat object
invalid class “Seurat” object: 10: All cells in images must be present in the Seurat object
invalid class “Seurat” object: 11: A

What I take from this is that Seurat's built-in validity check requires every cell referenced in the images to be present in the object. However, by subsetting my initial regions, I deliberately drop cells/spots from the data. I would therefore like either a workaround or an update to the Seurat package that makes this easier. I'm also looking for a workaround myself in the meantime. If you know of one, please let me know.
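To make the error concrete, here is a toy base-R model of the rule stated in the message: every cell referenced by an image must also be a cell of the object. This is only a sketch of the idea, not Seurat's actual implementation; `check_images` and the spot names are hypothetical.

```r
# Toy model of the rule in the error message: every cell referenced by an
# image must also be a cell of the object. `check_images` is a hypothetical
# helper, not part of Seurat.
check_images <- function(object_cells, images) {
  # For each image, TRUE if all of its cells are present in the object
  vapply(images, function(img_cells) all(img_cells %in% object_cells), logical(1))
}

object_cells <- c("spot1", "spot2", "spot3", "spot4")
images <- list(
  slice1 = c("spot1", "spot2"),
  slice2 = c("spot3", "spot4")
)

# Keep only some spots, as happens when a single cluster is split off
kept_cells <- c("spot1", "spot3")

# The images still list the dropped spots, so the check fails for both
stale <- check_images(kept_cells, images)

# Restricting each image to the kept spots makes the check pass again
trimmed_images <- lapply(images, intersect, y = kept_cells)
ok <- check_images(kept_cells, trimmed_images)
```

In this model, an object stays valid either when its images are trimmed together with the cells or when it carries no images at all.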

samuel-marsh commented 5 days ago

Hi,

Not a member of the dev team, but hopefully I can be helpful. Are you working with an object created with an earlier version of Seurat, or did you recreate the object starting in V5?

Best, Sam

JoyOtten commented 5 days ago

Hi Sam,

I created my object starting in V5. I just found a workaround: removing all the images from the Seurat object. The error is definitely coming from the image objects stored there, but I have no idea how to resolve it properly. This is the code for the workaround:
seurat@images <- list()

samuel-marsh commented 5 days ago

Hi @JoyOtten,

Can you post the full code you are running and a copy of the output of sessionInfo()?

Best, Sam

JoyOtten commented 5 days ago

Hi Sam,

Of course. One thing you should know is that I have multiple Spatial Transcriptomics/Visium objects in the Seurat object.

# libraries
library(reticulate)
use_python("/PHShome/je637/anaconda3/envs/r-reticulate/bin/python", required = TRUE)
library(Seurat, lib.loc = "/opt/R/4.4.0/lib/R/library")
library(Matrix)
library(readr)
library(dplyr)
library(stringr)
library(ggplot2)
library(SoupX, lib.loc = "/opt/R/4.4.0/lib/R/library")
library(leiden)
library(clustree)
set.seed(1234)

# Add sample information
even <- seq(2,16,2)
uneven <- seq(1,16,2)
samples <- unique(total@meta.data$samples)
total@meta.data$condition <- NA
total@meta.data$age <- NA
total@meta.data$slide <- NA
total@meta.data$position <- NA
for(i in samples){
  # Rows belonging to sample i; assigning only to these rows keeps later
  # iterations from overwriting the values set for earlier samples
  sel <- total@meta.data$samples == i
  if(i %in% even){ total@meta.data$condition[sel] <- "SD" }
  if(i %in% uneven){ total@meta.data$condition[sel] <- "CTRL" }
  if(i %in% c(1:8)){ total@meta.data$age[sel] <- "2.5month" }
  if(i %in% c(9:16)){ total@meta.data$age[sel] <- "3.5month" }
  if(i %in% c(1:4)){ total@meta.data$slide[sel] <- 1 }
  if(i %in% c(5:8)){ total@meta.data$slide[sel] <- 2 }
  if(i %in% c(9:12)){ total@meta.data$slide[sel] <- 3 }
  if(i %in% c(13:16)){ total@meta.data$slide[sel] <- 4 }
  if(i %in% c(1,5,9,13)){ total@meta.data$position[sel] <- 1 }
  if(i %in% c(2,6,10,14)){ total@meta.data$position[sel] <- 2 }
  if(i %in% c(3,7,11,15)){ total@meta.data$position[sel] <- 3 }
  if(i %in% c(4,8,12,16)){ total@meta.data$position[sel] <- 4 }
}

# Splitting the data so that it will result in the regions detected earlier
regions_total <- list()
res <- c("0.3")
for(i in res){
  message(i)
  resolution <- paste0("SCT_snn_res.", i)
  Idents(total) <- resolution
  regions <- SplitObject(total, split.by = resolution)
  regions_total <- append(regions_total, list(regions))
}
names(regions_total) <- res

# QC analysis per cluster
# Elbow plot
clusters <- unique(Idents(total))
for(i in clusters){
  message(i)
  region <- total_res[[i]]
  region@images[13:24] <- NULL
  seurat_cells <- colnames(region)
  # Remove the images slot if it's not relevant to this part of the analysis
  region@images <- list()
  region <- SCTransform(region, assay = "Spatial", return.only.var.genes = FALSE, verbose = FALSE)
  cluster <- FindNeighbors(region, reduction = "pca", dims = 1:30)
  pct <- cluster[["pca"]]@stdev / sum(cluster[["pca"]]@stdev) * 100
  cumu <- cumsum(pct)
  co1 <- which(cumu > 90 & pct < 5)[1]
  message(co1)
  co2 <- sort(which((pct[1:length(pct) - 1] - pct[2:length(pct)]) > 0.1), decreasing = T)[1] + 1
  message(co2)
  res_temp <- append(res_temp, list(co2))
}
names(res_temp) <- clusters

# PC 40 exhibits a cumulative percent greater than 90% for all regions. The last
# PC where the change in % of variation is more than 0.1% is PC 18 for all regions.

# Resolution 0.3
region <- regions_total[["0.3"]]
cluster_results_0.3 <- list()
clusters <- clusters[1:10]
# Cluster 11 is not detected in all samples and is therefore not subclustered further
for(i in clusters){
  message(i)
  number <- region[[i]]
  number@images <- list()
  cols_to_keep <- grep("SCT_snn", colnames(number@meta.data), invert = TRUE, value = TRUE)
  # Subset the meta.data to keep only the desired columns
  number@meta.data <- number@meta.data[, cols_to_keep]
  # Treat each cluster separately
  number <- SCTransform(number, assay = "Spatial", return.only.var.genes = FALSE, verbose = FALSE)
  cluster <- FindNeighbors(number, reduction = "pca", dims = 1:18)
  cluster <- FindClusters(cluster, algorithm = 4, verbose = FALSE, method = "igraph", resolution = c(0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1))
  clustered <- Seurat::RunUMAP(cluster, reduction = "pca", dims = 1:18)
  cluster_results_0.3 <- append(cluster_results_0.3, list(clustered))
}
names(cluster_results_0.3) <- clusters
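For readers following the elbow step in the script above, the two cutoffs can be sketched as a standalone base-R function. This is a minimal restatement under assumptions: `elbow_pcs` and its argument names are hypothetical, and `stdev` stands in for the per-PC standard deviations that the script reads from `cluster[["pca"]]@stdev`. As in the script, the percentages are shares of the summed standard deviations.

```r
# Sketch of the elbow heuristic from the script as a standalone function.
# `elbow_pcs` is a hypothetical name; `stdev` stands in for the per-PC
# standard deviations (cluster[["pca"]]@stdev in the script).
elbow_pcs <- function(stdev, cum_cutoff = 90, pct_cutoff = 5, drop_cutoff = 0.1) {
  pct <- stdev / sum(stdev) * 100   # each PC's share of the summed stdev, in percent
  cumu <- cumsum(pct)               # cumulative share

  # First PC whose cumulative share exceeds cum_cutoff while its own share
  # is below pct_cutoff
  co1 <- which(cumu > cum_cutoff & pct < pct_cutoff)[1]

  # Drop between consecutive PCs: pct[i] - pct[i + 1]
  drops <- -diff(pct)
  # PC just after the last drop that is larger than drop_cutoff
  big <- which(drops > drop_cutoff)
  co2 <- if (length(big) > 0) max(big) + 1L else NA_integer_

  c(co1 = co1, co2 = co2)
}

# Toy input, not taken from the data in this thread
stdev <- c(50, 25, 16, 4, 3, 1, 1)
result <- elbow_pcs(stdev)
```

With this toy input, `co1` is 4 and `co2` is 6. The script records `co2` per region, and the later `dims = 1:18` matches the PC 18 value reported in its comment.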

sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: Rocky Linux 9.3 (Blue Onyx)

Matrix products: default
BLAS/LAPACK: FlexiBLAS OPENBLAS-OPENMP; LAPACK version 3.9.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] clustree_0.5.1 ggraph_2.2.1 scuttle_1.14.0 SpotSweeper_1.0.2
[5] leiden_0.4.3.1 ggplot2_3.5.1 stringr_1.5.1 dplyr_1.1.4
[9] readr_2.1.5 Matrix_1.7-0 Seurat_5.1.0 SeuratObject_5.0.2
[13] sp_2.1-4 SpatialExperiment_1.14.0 SingleCellExperiment_1.26.0 SummarizedExperiment_1.34.0
[17] Biobase_2.64.0 GenomicRanges_1.56.0 GenomeInfoDb_1.40.1 IRanges_2.38.1
[21] S4Vectors_0.42.1 BiocGenerics_0.50.0 MatrixGenerics_1.16.0 matrixStats_1.4.1

loaded via a namespace (and not attached):
[1] RcppAnnoy_0.0.22 splines_4.4.0 later_1.3.2 tibble_3.2.1 polyclip_1.10-6
[6] fastDummies_1.7.3 lifecycle_1.0.4 hdf5r_1.3.11 globals_0.16.3 lattice_0.22-6
[11] MASS_7.3-60.2 backports_1.5.0 magrittr_2.0.3 plotly_4.10.4 httpuv_1.6.15
[16] glmGamPoi_1.16.0 sctransform_0.4.1 spam_2.10-0 spatstat.sparse_3.0-3 reticulate_1.37.0
[21] cowplot_1.1.3 pbapply_1.7-2 RColorBrewer_1.1-3 abind_1.4-8 zlibbioc_1.50.0
[26] Rtsne_0.17 purrr_1.0.2 tweenr_2.0.3 rappdirs_0.3.3 GenomeInfoDbData_1.2.12
[31] ggrepel_0.9.5 irlba_2.3.5.1 listenv_0.9.1 spatstat.utils_3.0-4 terra_1.7-78
[36] goftest_1.2-3 RSpectra_0.16-2 spatstat.random_3.2-3 fitdistrplus_1.1-11 parallelly_1.37.1
[41] DelayedMatrixStats_1.26.0 codetools_0.2-20 DelayedArray_0.30.1 ggforce_0.4.2 tidyselect_1.2.1
[46] UCSC.utils_1.0.0 farver_2.1.2 viridis_0.6.5 spatstat.explore_3.2-7 jsonlite_1.8.9
[51] escheR_1.4.0 BiocNeighbors_1.22.0 tidygraph_1.3.1 progressr_0.14.0 ggridges_0.5.6
[56] survival_3.7-0 tools_4.4.0 ica_1.0-3 Rcpp_1.0.13 glue_1.7.0
[61] gridExtra_2.3 SparseArray_1.4.8 withr_3.0.1 fastmap_1.2.0 fansi_1.0.6
[66] digest_0.6.35 R6_2.5.1 mime_0.12 colorspace_2.1-1 scattermore_1.2
[71] tensor_1.5 spatstat.data_3.0-4 utf8_1.2.4 tidyr_1.3.1 generics_0.1.3
[76] data.table_1.15.4 graphlayouts_1.1.1 httr_1.4.7 htmlwidgets_1.6.4 S4Arrays_1.4.1
[81] uwot_0.2.2 pkgconfig_2.0.3 gtable_0.3.5 lmtest_0.9-40 XVector_0.44.0
[86] htmltools_0.5.8.1 dotCall64_1.1-1 scales_1.3.0 png_0.1-8 rstudioapi_0.16.0
[91] tzdb_0.4.0 reshape2_1.4.4 rjson_0.2.23 checkmate_2.3.2 nlme_3.1-165
[96] cachem_1.1.0 zoo_1.8-12 KernSmooth_2.23-24 parallel_4.4.0 miniUI_0.1.1.1
[101] pillar_1.9.0 grid_4.4.0 vctrs_0.6.5 RANN_2.6.2 promises_1.3.0
[106] beachmat_2.20.0 xtable_1.8-4 cluster_2.1.6 magick_2.8.3 cli_3.6.3
[111] compiler_4.4.0 rlang_1.1.4 crayon_1.5.3 future.apply_1.11.2 labeling_0.4.3
[116] spatialEco_2.0-2 plyr_1.8.9 stringi_1.8.4 viridisLite_0.4.2 deldir_2.0-4
[121] BiocParallel_1.38.0 munsell_0.5.1 lazyeval_0.2.2 spatstat.geom_3.2-9 RcppHNSW_0.6.0
[126] hms_1.1.3 patchwork_1.2.0 bit64_4.5.2 sparseMatrixStats_1.16.0 future_1.33.2
[131] shiny_1.8.1.1 ROCR_1.0-11 igraph_2.0.3 memoise_2.0.1 bit_4.5.0

wangbenwang123 commented 5 days ago

If you want to re-cluster specific subpopulations from your clustering results, you'll need to subset the Seurat object and then re-run RunPCA afterward.

samuel-marsh commented 5 days ago

Hi @wangbenwang123,

Yes, that is true, but it does not address the errors that are occurring in this particular instance. Additionally, more than just PCA needs to be re-run.

@JoyOtten I believe the issues are likely coming from the splitting of your object and some of the modifications made afterwards. In Seurat V5, SplitObject is no longer used; instead, it is the layers within the object that are split. The object is therefore expected to contain all of the cells/images overall, with layers split as needed. I would check out the V5-specific vignettes on how layers are split and handled to see whether adapting your code solves the issue.

Best, Sam

JoyOtten commented 4 days ago

I already found a workaround that works for me, but the fact that each sample is now stored in a separate layer in Seurat V5 is a bit annoying.