ncborcherding / escape

Easy single cell analysis platform for enrichment
https://www.borch.dev/uploads/screpertoire/articles/running_escape
MIT License

Error in cnts[gene.set, ] : subscript out of bounds #101

Closed RosyGH closed 5 months ago

RosyGH commented 5 months ago

geyserEnrichment(seurat_ex,

Thanks a lot for developing this useful tool! I can't figure out what is wrong, but if I change my code to use another gene.set, it runs without any errors:

geyserEnrichment(seurat_ex,

It seems that the form of the gene.set name affects the result.

ncborcherding commented 5 months ago

Hey @RosyGH,

Happy to help troubleshoot; however, I will need more information, such as the output of sessionInfo() and a reproducible example (see Hadley's rundown of a reproducible example). If data privacy is a concern, please make an example with the built-in data.

Thanks, Nick

RosyGH commented 5 months ago

sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22631)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.utf8
[2] LC_CTYPE=Chinese (Simplified)_China.utf8
[3] LC_MONETARY=Chinese (Simplified)_China.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.utf8

time zone: Asia/Shanghai
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base

other attached packages:
[1] dittoSeq_1.17.0 Seurat_5.0.1
[3] SeuratObject_5.0.1 sp_2.1-2
[5] escape_2.0.0 ggplot2_3.5.1
[7] devtools_2.4.5 usethis_2.2.2

loaded via a namespace (and not attached):
[1] RcppAnnoy_0.0.21
[2] splines_4.3.1
[3] later_1.3.1
[4] bitops_1.0-7
[5] tibble_3.2.1
[6] R.oo_1.25.0
[7] polyclip_1.10-6
[8] graph_1.78.0
[9] XML_3.99-0.16
[10] fastDummies_1.7.3
[11] lifecycle_1.0.4
[12] globals_0.16.2
[13] processx_3.8.2
[14] lattice_0.22-5
[15] MASS_7.3-60.0.1
[16] ggdist_3.3.2
[17] magrittr_2.0.3
[18] plotly_4.10.4
[19] remotes_2.4.2.1
[20] httpuv_1.6.12
[21] sctransform_0.4.1
[22] spam_2.10-0
[23] spatstat.sparse_3.0-3
[24] sessioninfo_1.2.2
[25] pkgbuild_1.4.3
[26] reticulate_1.34.0
[27] pbapply_1.7-2
[28] cowplot_1.1.2
[29] DBI_1.2.1
[30] RColorBrewer_1.1-3
[31] abind_1.4-5
[32] pkgload_1.3.4
[33] zlibbioc_1.46.0
[34] Rtsne_0.17
[35] GenomicRanges_1.52.0
[36] purrr_1.0.2
[37] R.utils_2.12.3
[38] BiocGenerics_0.46.0
[39] msigdbr_7.5.1
[40] RCurl_1.98-1.14
[41] GenomeInfoDbData_1.2.10
[42] IRanges_2.34.1
[43] S4Vectors_0.38.2
[44] ggrepel_0.9.5
[45] irlba_2.3.5.1
[46] spatstat.utils_3.0-4
[47] listenv_0.9.0
[48] pheatmap_1.0.12
[49] GSVA_1.53.3
[50] goftest_1.2-3
[51] RSpectra_0.16-1
[52] spatstat.random_3.2-2
[53] annotate_1.78.0
[54] fitdistrplus_1.1-11
[55] parallelly_1.36.0
[56] DelayedMatrixStats_1.22.6
[57] leiden_0.4.3.1
[58] codetools_0.2-19
[59] DelayedArray_0.26.7
[60] tidyselect_1.2.0
[61] farver_2.1.1
[62] UCell_2.9.0
[63] ScaledMatrix_1.8.1
[64] spatstat.explore_3.2-5
[65] matrixStats_1.2.0
[66] stats4_4.3.1
[67] jsonlite_1.8.8
[68] BiocNeighbors_1.18.0
[69] ellipsis_0.3.2
[70] progressr_0.14.0
[71] ggridges_0.5.5
[72] survival_3.5-7
[73] tools_4.3.1
[74] ica_1.0-3
[75] Rcpp_1.0.11
[76] glue_1.6.2
[77] gridExtra_2.3
[78] SparseArray_1.2.4
[79] MatrixGenerics_1.12.3
[80] distributional_0.4.0
[81] GenomeInfoDb_1.36.3
[82] AUCell_1.26.0
[83] dplyr_1.1.3
[84] HDF5Array_1.28.1
[85] withr_3.0.0
[86] fastmap_1.1.1
[87] rhdf5filters_1.12.1
[88] fansi_1.0.4
[89] ggpointdensity_0.1.0
[90] callr_3.7.3
[91] digest_0.6.33
[92] rsvd_1.0.5
[93] R6_2.5.1
[94] mime_0.12
[95] colorspace_2.1-0
[96] scattermore_1.2
[97] tensor_1.5
[98] spatstat.data_3.0-4
[99] RSQLite_2.3.4
[100] R.methodsS3_1.8.2
[101] tidyr_1.3.0
[102] utf8_1.2.3
[103] generics_0.1.3
[104] data.table_1.14.8
[105] httr_1.4.7
[106] htmlwidgets_1.6.4
[107] S4Arrays_1.2.1
[108] uwot_0.1.16
[109] pkgconfig_2.0.3
[110] gtable_0.3.4
[111] blob_1.2.4
[112] lmtest_0.9-40
[113] SingleCellExperiment_1.22.0
[114] XVector_0.40.0
[115] htmltools_0.5.7
[116] profvis_0.3.8
[117] dotCall64_1.1-1
[118] GSEABase_1.62.0
[119] scales_1.3.0
[120] Biobase_2.60.0
[121] png_0.1-8
[122] SpatialExperiment_1.14.0
[123] rstudioapi_0.15.0
[124] reshape2_1.4.4
[125] rjson_0.2.21
[126] nlme_3.1-164
[127] curl_5.0.2
[128] zoo_1.8-12
[129] cachem_1.0.8
[130] rhdf5_2.44.0
[131] stringr_1.5.0
[132] KernSmooth_2.23-22
[133] parallel_4.3.1
[134] miniUI_0.1.1.1
[135] AnnotationDbi_1.62.2
[136] desc_1.4.3
[137] pillar_1.9.0
[138] grid_4.3.1
[139] vctrs_0.6.3
[140] RANN_2.6.1
[141] urlchecker_1.0.1
[142] promises_1.2.1
[143] BiocSingular_1.16.0
[144] beachmat_2.16.0
[145] xtable_1.8-4
[146] cluster_2.1.6
[147] magick_2.8.3
[148] cli_3.6.1
[149] compiler_4.3.1
[150] rlang_1.1.1
[151] crayon_1.5.2
[152] future.apply_1.11.1
[153] labeling_0.4.3
[154] ps_1.7.5
[155] plyr_1.8.9
[156] fs_1.6.3
[157] stringi_1.7.12
[158] deldir_2.0-2
[159] viridisLite_0.4.2
[160] BiocParallel_1.34.2
[161] babelgene_22.9
[162] munsell_0.5.0
[163] Biostrings_2.68.1
[164] lazyeval_0.2.2
[165] spatstat.geom_3.2-7
[166] Matrix_1.6-5
[167] RcppHNSW_0.5.0
[168] patchwork_1.2.0
[169] sparseMatrixStats_1.12.2
[170] bit64_4.0.5
[171] future_1.33.1
[172] Rhdf5lib_1.22.1
[173] KEGGREST_1.40.1
[174] shiny_1.8.0
[175] SummarizedExperiment_1.30.2
[176] ROCR_1.0-11
[177] igraph_1.6.0
[178] memoise_2.0.1
[179] bit_4.0.5

RosyGH commented 5 months ago

library(ggplot2)
library(escape)
library(Seurat)
library(dittoSeq)
library(grDevices)

pbmc_small <- get("pbmc_small")
sce.pbmc <- as.SingleCellExperiment(pbmc_small, assay = "RNA")
GS.hallmark <- getGeneSets(library = "H")

pbmc_small <- runEscape(pbmc_small,
                        method = "ssGSEA",
                        gene.sets = GS.hallmark,
                        groups = 1000,
                        min.size = 5,
                        new.assay.name = "escape.ssGSEA")

heatmapEnrichment(pbmc_small,
                  assay = "escape.ssGSEA",
                  palette = "Spectral")

geyserEnrichment(pbmc_small,
                 assay = "escape.ssGSEA",
                 gene.set = "HALLMARK−TGF−BETA−SIGNALING")

scatterEnrichment(seurat_ex,
                  assay = "escape.ssGSEA",
                  x.axis = "HALLMARK−TGF−BETA−SIGNALING",
                  y.axis = "HALLMARK-WNT-BETA-CATENIN-SIGNALING")

Error in cnts[gene.set, ] : subscript out of bounds

ncborcherding commented 5 months ago

Hey RosyGH,

Thanks for the rundown - I was able to identify the issue. runEscape() has an internal filter that requires a minimum number of genes in a gene set. With min.size = 5, "HALLMARK−TGF−BETA−SIGNALING" does not pass the filter.
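
As a rough illustration of how a size filter like this behaves, here is a base-R sketch with toy data. It is not escape's internal code; `expr`, `gene_sets`, `min_size`, and the gene and set names are all made up for the example, and it assumes the filter counts the genes of each set that are present in the expression matrix.

```r
# Toy expression matrix: 6 genes x 3 cells (hypothetical data).
expr <- matrix(
  1:18, nrow = 6,
  dimnames = list(paste0("GENE", 1:6), paste0("cell", 1:3))
)

# Toy gene sets (hypothetical names and members).
gene_sets <- list(
  SET_A = c("GENE1", "GENE2", "GENE3", "GENE4", "GENE5"),
  SET_B = c("GENE1", "GENE7", "GENE8", "GENE9", "GENE10")
)

min_size <- 5

# Count how many genes from each set are present in the matrix.
n_found <- vapply(
  gene_sets,
  function(genes) sum(genes %in% rownames(expr)),
  integer(1)
)

# Keep only the sets that reach the minimum size.
kept <- names(n_found)[n_found >= min_size]

print(n_found)
print(kept)
```

In this toy case only one of SET_B's genes is in the matrix, so SET_B is dropped and would not appear in any downstream result. Running the same kind of count on your own gene sets and object should show which sets fall below min.size.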

You can see the gene sets that are calculated with:

rownames(pbmc_small@assays$escape.ssGSEA)
[1] "HALLMARK-ALLOGRAFT-REJECTION" "HALLMARK-APOPTOSIS"
[3] "HALLMARK-COAGULATION" "HALLMARK-COMPLEMENT"
[5] "HALLMARK-EPITHELIAL-MESENCHYMAL-TRANSITION" "HALLMARK-ESTROGEN-RESPONSE-LATE"
[7] "HALLMARK-FATTY-ACID-METABOLISM" "HALLMARK-HEME-METABOLISM"
[9] "HALLMARK-IL2-STAT5-SIGNALING" "HALLMARK-IL6-JAK-STAT3-SIGNALING"
[11] "HALLMARK-INFLAMMATORY-RESPONSE" "HALLMARK-INTERFERON-GAMMA-RESPONSE"
[13] "HALLMARK-KRAS-SIGNALING-UP" "HALLMARK-MTORC1-SIGNALING"
[15] "HALLMARK-MYOGENESIS" "HALLMARK-P53-PATHWAY"
[17] "HALLMARK-TNFA-SIGNALING-VIA-NFKB"

If we pick one from the list, we can get it to plot:

geyserEnrichment(pbmc_small,
                  assay = "escape.ssGSEA",
                  gene.set = "HALLMARK-TNFA-SIGNALING-VIA-NFKB")
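
To avoid hitting "subscript out of bounds" in the first place, one option is to check that the name exists among the computed row names before plotting. The sketch below uses a toy score matrix in base R rather than a real escape assay; `scores` and `wanted` are hypothetical names, and with a real object the row names would come from the rownames() call above.

```r
# Toy enrichment score matrix: 2 gene sets x 2 cells (hypothetical values).
scores <- matrix(
  c(0.1, 0.2, 0.3, 0.4), nrow = 2,
  dimnames = list(
    c("HALLMARK-APOPTOSIS", "HALLMARK-TNFA-SIGNALING-VIA-NFKB"),
    c("cell1", "cell2")
  )
)

wanted <- "HALLMARK-TGF-BETA-SIGNALING"

# Check the name before indexing; the match is exact, character for character.
is_present <- wanted %in% rownames(scores)

if (is_present) {
  print(scores[wanted, ])
} else {
  message("'", wanted, "' was not calculated; available gene sets:")
  print(rownames(scores))
}

# Indexing a matrix by a row name that does not exist raises an error,
# which is captured here instead of stopping the script.
result <- tryCatch(
  scores[wanted, ],
  error = function(e) conditionMessage(e)
)
print(result)
```

Because the comparison is exact, a name that differs in any character (for example a different dash character or an underscore) is treated as missing, just like a gene set that was filtered out.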

Hope that helps and let me know if you have any other questions, Nick