subscript out of bounds when trying to run PrepSCTFindMarkers()

Hello,

For my analysis, I have 12 samples that I merged before performing the normalization with SCTransform. When pursuing the analysis, I saw that I could use only 3 samples to perform a good annotation of the clusters, because for the other samples, there was an abnormal expression of a lot of genes and it was complicated to conclude. So I splitted the merged seurat object and merge again only the 3 seurat object that were interesting for the annotation. Then, I wanted to check the markers of these clusters using FindAllMarkers(). Since the data was normalized with SCTransform, I had first to perform a PrepSCTFindMarkers(). However, each time I tried to run this, I got the error:

Found 3 SCT models. Recorrecting SCT counts using minimum median counts: 5574 | | 0 % ~calculating Error in .subscript.2ary(x, i, j, drop = TRUE) : subscript out of bounds

Here is my code:

# Load data
merged_data <- base::readRDS("merged_data.rds")

# Split data
obj.list <- Seurat::SplitObject(merged_data, split.by = "orig.ident")

# Merge control datasets
control <- merge(obj.list$control1, y = c(obj.list$control2, obj.list$control3))

# Prepare the data to be run with the function FindAllMarkers
control <- Seurat::PrepSCTFindMarkers(control, assay = "SCT", verbose = TRUE)

Here is my sessionInfo()

> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS/LAPACK: /shared/software/miniconda/envs/r-4.3.1/lib/libopenblasp-r0.3.24.so;  LAPACK version 3.11.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Paris
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] loupeR_1.1.0

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.22            splines_4.3.1               later_1.3.2                
  [4] BiocIO_1.12.0               bitops_1.0-7                tibble_3.2.1               
  [7] polyclip_1.10-6             XML_3.99-0.17               fastDummies_1.7.3          
 [10] lifecycle_1.0.4             scDblFinder_1.16.0          edgeR_4.0.16               
 [13] hdf5r_1.3.10                globals_0.16.3              lattice_0.22-6             
 [16] MASS_7.3-60.0.1             magrittr_2.0.3              rmarkdown_2.27             
 [19] limma_3.58.1                plotly_4.10.4               yaml_2.3.9                 
 [22] metapod_1.10.1              httpuv_1.6.15               Seurat_5.1.0               
 [25] sctransform_0.4.1           askpass_1.2.0               spam_2.10-0                
 [28] sp_2.1-4                    spatstat.sparse_3.1-0       reticulate_1.38.0          
 [31] cowplot_1.1.3               pbapply_1.7-2               RColorBrewer_1.1-3         
 [34] abind_1.4-5                 zlibbioc_1.48.2             Rtsne_0.17                 
 [37] GenomicRanges_1.54.1        purrr_1.0.2                 BiocGenerics_0.48.1        
 [40] RCurl_1.98-1.14             GenomeInfoDbData_1.2.11     IRanges_2.36.0             
 [43] S4Vectors_0.40.2            ggrepel_0.9.5               irlba_2.3.5.1              
 [46] listenv_0.9.1               spatstat.utils_3.0-5        umap_0.2.10.0              
 [49] goftest_1.2-3               RSpectra_0.16-1             dqrng_0.4.1                
 [52] spatstat.random_3.2-3       fitdistrplus_1.1-11         parallelly_1.37.1          
 [55] DelayedMatrixStats_1.24.0   leiden_0.4.3.1              codetools_0.2-20           
 [58] DelayedArray_0.28.0         scuttle_1.12.0              tidyselect_1.2.1           
 [61] ScaledMatrix_1.10.0         viridis_0.6.5               matrixStats_1.3.0          
 [64] stats4_4.3.1                spatstat.explore_3.2-7      GenomicAlignments_1.38.2   
 [67] jsonlite_1.8.8              BiocNeighbors_1.20.2        progressr_0.14.0           
 [70] ggridges_0.5.6              survival_3.7-0              scater_1.30.1              
 [73] tools_4.3.1                 ica_1.0-3                   Rcpp_1.0.12                
 [76] glue_1.7.0                  gridExtra_2.3               SparseArray_1.2.4          
 [79] xfun_0.45                   MatrixGenerics_1.14.0       GenomeInfoDb_1.38.8        
 [82] dplyr_1.1.4                 fastmap_1.2.0               bluster_1.12.0             
 [85] fansi_1.0.6                 openssl_2.2.0               digest_0.6.36              
 [88] rsvd_1.0.5                  R6_2.5.1                    mime_0.12                  
 [91] colorspace_2.1-0            scattermore_1.2             tensor_1.5                 
 [94] spatstat.data_3.1-2         utf8_1.2.4                  tidyr_1.3.1                
 [97] generics_0.1.3              renv_1.0.7                  data.table_1.15.4          
[100] rtracklayer_1.62.0          httr_1.4.7                  htmlwidgets_1.6.4          
[103] S4Arrays_1.2.1              uwot_0.2.2                  pkgconfig_2.0.3            
[106] gtable_0.3.5                lmtest_0.9-40               SingleCellExperiment_1.24.0
[109] XVector_0.42.0              htmltools_0.5.8.1           dotCall64_1.1-1            
[112] SeuratObject_5.0.2          scales_1.3.0                Biobase_2.62.0             
[115] png_0.1-8                   scran_1.30.2                knitr_1.48                 
[118] rstudioapi_0.16.0           reshape2_1.4.4              rjson_0.2.21               
[121] nlme_3.1-165                zoo_1.8-12                  stringr_1.5.1              
[124] KernSmooth_2.23-24          parallel_4.3.1              miniUI_0.1.1.1             
[127] vipor_0.4.7                 restfulr_0.0.15             pillar_1.9.0               
[130] grid_4.3.1                  vctrs_0.6.5                 RANN_2.6.1                 
[133] promises_1.3.0              BiocSingular_1.18.0         beachmat_2.18.1            
[136] xtable_1.8-4                cluster_2.1.6               beeswarm_0.4.0             
[139] evaluate_0.24.0             locfit_1.5-9.10             cli_3.6.3                  
[142] compiler_4.3.1              Rsamtools_2.18.0            rlang_1.1.4                
[145] crayon_1.5.3                future.apply_1.11.2         plyr_1.8.9                 
[148] ggbeeswarm_0.7.2            stringi_1.8.4               viridisLite_0.4.2          
[151] deldir_2.0-4                BiocParallel_1.36.0         munsell_0.5.1              
[154] Biostrings_2.70.3           lazyeval_0.2.2              spatstat.geom_3.2-9        
[157] Matrix_1.6-5                RcppHNSW_0.6.0              patchwork_1.2.0            
[160] bit64_4.0.5                 sparseMatrixStats_1.14.0    future_1.33.2              
[163] ggplot2_3.5.1               statmod_1.5.0               shiny_1.8.1.1              
[166] SummarizedExperiment_1.32.0 ROCR_1.0-11                 igraph_2.0.3               
[169] bit_4.0.5                   xgboost_1.7.7.1

I suspect that splitting then merging some samples without the others is the problem. So, is there any way to do this correctly ?

Thanks a lot for your help,

satijalab / seurat

subscript out of bounds when trying to run PrepSCTFindMarkers() #9112