Plotting manually assigned clusters on tSNE

Tregwiz commented 1 year ago

Hi, Thanks for providing such a great pipeline for flow cytometry analysis. My issue is similar to a previous post: https://github.com/HelenaLC/CATALYST/issues/174. Apologies if this issue would be better placed in that thread; I was unsure whether you would see any further comments, as that thread is marked as closed. Basically, I completed FlowSOM clustering using CATALYST, and from the 18 clusters I obtained, I reduced these to 10 by manually assigning specific clusters to the same new cluster:

old_cluster <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18")
new_cluster <- c("a", "a", "b", "b", "c", "d", "e", "a", "f", "g", "d", "h", "i", "g", "i", "i", "c", "j")
df <- data.frame(old_cluster, new_cluster)
sce <- mergeClusters(sce, k = "meta18", table = df, id = "merge10")

I am however unsure how to visualise these on the tSNE plot using plotDR, as I am unsure what to write in the color_by= "" argument.

Any help would be greatly appreciated. Thanks.

HelenaLC commented 1 year ago

plotDR(sce, color_by = "cluster_id", k = "merge10") should do the trick. Have you read the function documentation? If it’s not clear from there, I’m open to feedback and improving it, thanks.

Tregwiz commented 1 year ago

Hi Helena, Thanks a lot for your prompt reply. Unfortunately, that is the code I had tried previously; however, it returns the following:

Error in plotDR(sce, color_by = "cluster_id", k = "merge_10") : length(reducedDims(x)) != 0 is not TRUE

HelenaLC commented 1 year ago

Well, you need to runDR() first to get the t-SNE… sry, was assuming that was clear.

Tregwiz commented 1 year ago

Hi Helena, Thanks for your reply. Sorry, I should have been clearer: I have previously run runDR() to generate the t-SNE and can plot the FlowSOM-generated clusters on it using plotDR(sce, color_by = "meta18"). However, when I try to plot the manually merged clusters using plotDR(sce, color_by = "cluster_id", k = "merge_10"), it returns length(reducedDims(x)) != 0 is not TRUE. Best wishes, Gabriel

HelenaLC commented 1 year ago

Aha, I see, thanks for clarifying! What would really help is: could you please post the output of the following commands: sce and names(cluster_codes(sce)) (right before the error occurs), thanks!

Tregwiz commented 1 year ago

Hi Helena, sce returns the following:

class: SingleCellExperiment
dim: 15 407149
metadata(5): experiment_info chs_by_fcs cluster_codes SOM_codes delta_area
assays(2): counts exprs
rownames(15): CD64 PDL1 ... CD206 CD86
rowData names(4): channel_name marker_name marker_class used_for_clustering
colnames: NULL
colData names(4): sample_id condition patient_id cluster_id
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):

names(cluster_codes(sce)) returns (apologies for having a few different merging names):

[1] "som100" "meta2" "meta3" "meta4" "meta5" "meta6"
[7] "meta7" "meta8" "meta9" "meta10" "meta11" "meta12"
[13] "meta13" "meta14" "meta15" "meta16" "meta17" "meta18"
[19] "meta19" "meta20" "merging1" "merging" "merging2" "merge"
[25] "merge10"

HelenaLC commented 1 year ago

Yes, so as you can see, the reducedDimNames slot is empty, meaning there are no dimensionality reductions in the object, hence the error you're getting. Here's a couple things I'd like you to do/try:

Please post the output of your sessionInfo() with every issue, so we can rule out version-specific causes that might have already been resolved.
Are you absolutely positive you ran runDR()? If so, can you check if there are reduced dimensions in the SCE before running mergeClusters()? Because if they were there and are dropped, that'd indeed be a bug.
If none of these leads to a hint towards solving this, could you post the code you ran up to this point? Ideally only relevant parts, e.g., prepData(), cluster(), runDR(), mergeClusters() and plotDR() (but the full code you have in your script, I mean).
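
The second check above (are the reduced dimensions present before mergeClusters(), and still there afterwards?) could be sketched roughly as follows. This is only a minimal sketch, not the maintainer's own code: it assumes the PBMC example data bundled with CATALYST (PBMC_fs, PBMC_panel, PBMC_md) in place of the poster's files, a small cells value to keep t-SNE fast, and a purely illustrative two-label merging table with the hypothetical id "merge2". t-SNE additionally requires the Rtsne package.

```r
library(CATALYST)
library(SingleCellExperiment)

# Assumption: CATALYST's bundled PBMC example data stands in for the real dataset
data(PBMC_fs, PBMC_panel, PBMC_md)
sce <- prepData(PBMC_fs, PBMC_panel, PBMC_md)
sce <- cluster(sce, features = "type", maxK = 20, seed = 1, verbose = FALSE)

# Assign the result back to sce so the reduction is stored in the same object
sce <- runDR(sce, dr = "TSNE", cells = 200, features = "type")
before <- reducedDimNames(sce)

# Illustrative merging table (hypothetical): collapse 18 metaclusters into 2 labels
df <- data.frame(
  old_cluster = as.character(1:18),
  new_cluster = rep(c("a", "b"), each = 9)
)
sce <- mergeClusters(sce, k = "meta18", table = df, id = "merge2")
after <- reducedDimNames(sce)

# If reductions were present before merging but are gone afterwards,
# that would point to a bug rather than a missing runDR() call
print(before)
print(after)
print(identical(before, after))
```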

Tregwiz commented 1 year ago

Hi Helena, Really sorry, there was actually an error on my part previously. I had accidentally stored the output of runDR() in a new object called tsne, instead of assigning it back to sce, by running:

tsne <- runDR(sce, dr = "TSNE", cells = 2100, features = "type")

So tsne contains the dimensionality reduction rather than sce.
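
The mix-up described here follows from how R handles objects: runDR() returns a modified copy and leaves its input unchanged. A minimal sketch of that behaviour, assuming CATALYST's bundled PBMC example data instead of the dataset in this thread and a small cells value chosen only for speed:

```r
library(CATALYST)
library(SingleCellExperiment)

# Assumption: bundled PBMC example data stands in for the real dataset
data(PBMC_fs, PBMC_panel, PBMC_md)
sce <- prepData(PBMC_fs, PBMC_panel, PBMC_md)
sce <- cluster(sce, features = "type", maxK = 20, seed = 1, verbose = FALSE)

# The reduction is stored only in the object that runDR() returns
tsne <- runDR(sce, dr = "TSNE", cells = 200, features = "type")

print(reducedDimNames(sce))   # sce itself was not modified
print(reducedDimNames(tsne))  # the returned copy holds the t-SNE

# Either keep working with tsne from here on, or overwrite sce
sce <- tsne
```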

tsne returns:

class: SingleCellExperiment
dim: 15 407149
metadata(5): experiment_info chs_by_fcs cluster_codes SOM_codes delta_area
assays(2): counts exprs
rownames(15): CD64 PDL1 ... CD206 CD86
rowData names(4): channel_name marker_name marker_class used_for_clustering
colnames: NULL
colData names(4): sample_id condition patient_id cluster_id
reducedDimNames(1): TSNE
mainExpName: NULL
altExpNames(0):

names(cluster_codes(tsne)) returns:

[1] "som100" "meta2" "meta3" "meta4" "meta5" "meta6" "meta7" "meta8" "meta9" "meta10"
[11] "meta11" "meta12" "meta13" "meta14" "meta15" "meta16" "meta17" "meta18" "meta19" "meta20"
[21] "merge10"

Now running plotDR(tsne, color_by = "cluster_id", k = "merge10") returns the t-SNE plot; however, it plots 100 different cluster IDs rather than the 10 (from "merge10") I was expecting:

Tregwiz commented 1 year ago

Also this is my session info:

R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.5.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggpubr_0.4.0 ggsignif_0.6.4 readxl_1.4.1
[4] destiny_3.10.0 uwot_0.1.14 umap_0.2.9.0
[7] scater_1.24.0 ggplot2_3.3.6 scuttle_1.6.3
[10] diffcyt_1.16.0 flowCore_2.8.0 cowplot_1.1.1
[13] carData_3.0-5 CATALYST_1.20.1 SingleCellExperiment_1.18.1
[16] SummarizedExperiment_1.26.1 Biobase_2.56.0 GenomicRanges_1.48.0
[19] GenomeInfoDb_1.32.4 IRanges_2.30.1 S4Vectors_0.34.0
[22] BiocGenerics_0.42.0 MatrixGenerics_1.8.1 matrixStats_0.62.0
[25] Matrix_1.5-1

loaded via a namespace (and not attached):
[1] rsvd_1.0.5 vcd_1.4-10 class_7.3-20
[4] foreach_1.5.2 lmtest_0.9-40 crayon_1.5.2
[7] laeken_0.5.2 MASS_7.3-58.1 nlme_3.1-160
[10] backports_1.4.1 rlang_1.0.6 XVector_0.36.0
[13] irlba_2.3.5.1 nloptr_2.0.3 limma_3.52.4
[16] smoother_1.1 BiocParallel_1.30.4 rjson_0.2.21
[19] glue_1.6.2 pheatmap_1.0.12 parallel_4.2.1
[22] vipor_0.4.5 tidyselect_1.2.0 XML_3.99-0.12
[25] tidyr_1.2.1 zoo_1.8-11 nnls_1.4
[28] RcppHNSW_0.4.1 magrittr_2.0.3 evaluate_0.17
[31] cli_3.4.1 zlibbioc_1.42.0 rstudioapi_0.14
[34] sp_1.5-0 bslib_0.4.1 RcppEigen_0.3.3.9.2
[37] BiocSingular_1.12.0 xfun_0.34 askpass_1.1
[40] clue_0.3-62 cluster_2.1.4 pcaMethods_1.88.0
[43] tibble_3.1.8 ggrepel_0.9.1 png_0.1-7
[46] withr_2.5.0 bitops_1.0-7 aws.signature_0.6.0
[49] ggforce_0.4.1 RBGL_1.72.0 ranger_0.14.1
[52] plyr_1.8.7 cellranger_1.1.0 ncdfFlow_2.42.1
[55] e1071_1.7-12 pillar_1.8.1 RcppParallel_5.1.5
[58] GlobalOptions_0.1.2 cachem_1.0.6 multcomp_1.4-20
[61] scatterplot3d_0.3-42 CytoML_2.8.1 TTR_0.24.3
[64] GetoptLong_1.0.5 DelayedMatrixStats_1.18.2 xts_0.12.2
[67] vctrs_0.5.0 generics_0.1.3 tools_4.2.1
[70] beeswarm_0.4.0 munsell_0.5.0 tweenr_2.0.2
[73] aws.s3_0.3.21 proxy_0.4-27 DelayedArray_0.22.0
[76] fastmap_1.1.0 compiler_4.2.1 abind_1.4-5
[79] GenomeInfoDbData_1.2.8 gridExtra_2.3 edgeR_3.38.4
[82] lattice_0.20-45 ggnewscale_0.4.8 ggpointdensity_0.1.0
[85] deldir_1.0-6 utf8_1.2.2 dplyr_1.0.10
[88] jsonlite_1.8.3 ggplot.multistats_1.0.0 scales_1.2.1
[91] graph_1.74.0 ScaledMatrix_1.4.1 sparseMatrixStats_1.8.0
[94] car_3.1-1 doParallel_1.0.17 latticeExtra_0.6-30
[97] reticulate_1.26 rmarkdown_2.17 sandwich_3.0-2
[100] Rtsne_0.16 igraph_1.3.5 survival_3.4-0
[103] yaml_2.3.6 plotrix_3.8-2 cytolib_2.8.0
[106] flowWorkspace_4.8.0 htmltools_0.5.3 locfit_1.5-9.6
[109] viridisLite_0.4.1 digest_0.6.30 assertthat_0.2.1
[112] remotes_2.4.2 data.table_1.14.4 drc_3.0-1
[115] splines_4.2.1 labeling_0.4.2 ggsci_2.9
[118] RCurl_1.98-1.9 broom_1.0.1 colorspace_2.0-3
[121] ConsensusClusterPlus_1.60.0 base64enc_0.1-3 BiocManager_1.30.19
[124] ggbeeswarm_0.6.0 shape_1.4.6 nnet_7.3-18
[127] sass_0.4.2 Rcpp_1.0.9 mvtnorm_1.1-3
[130] circlize_0.4.15 FlowSOM_2.4.0 RProtoBufLib_2.8.0
[133] fansi_1.0.3 VIM_6.2.2 R6_2.5.1
[136] grid_4.2.1 ggridges_0.5.4 lifecycle_1.0.3
[139] curl_4.3.3 minqa_1.2.5 jquerylib_0.1.4
[142] robustbase_0.95-0 TH.data_1.1-1 RColorBrewer_1.1-3
[145] iterators_1.0.14 stringr_1.4.1 beachmat_2.12.0
[148] polyclip_1.10-4 purrr_0.3.5 ComplexHeatmap_2.12.1
[151] openssl_2.0.4 codetools_0.2-18 gtools_3.9.3
[154] RSpectra_0.16-1 gtable_0.3.1 DBI_1.1.3
[157] httr_1.4.4 highr_0.9 stringi_1.7.8
[160] reshape2_1.4.4 farver_2.1.1 viridis_0.6.2
[163] ggthemes_4.2.4 hexbin_1.28.2 Rgraphviz_2.40.0
[166] xml2_1.3.3 colorRamps_2.3.1 ggcyto_1.24.1
[169] boot_1.3-28 BiocNeighbors_1.14.0 lme4_1.1-31
[172] interp_1.1-3 scattermore_0.8 DEoptimR_1.0-11
[175] jpeg_0.1-9 pkgconfig_2.0.3 rstatix_0.7.0
[178] knitr_1.40

HelenaLC commented 1 year ago

Okay, so we're making progress. Could you post the table you used for merging? And maybe the output of table(cluster_codes(tsne, "merge10"))?

Tregwiz commented 1 year ago

old_cluster <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18")
new_cluster <- c("a", "a", "b", "b", "c", "d", "e", "a", "f", "g", "d", "h", "i", "g", "i", "i", "c", "j")
df <- data.frame(old_cluster, new_cluster)

tsne <- mergeClusters(tsne, k = "meta18", table = df, id = "merge10")

For table(cluster_codes(tsne, "merge10")), this doesn't return anything. Did you mean table(cluster_ids(tsne, k = "merge10"))? This returns:

     a      b      c      d      e      f      g      h      i      j
  8191   3987   4658  28965  66522 225220    476  30638  27291  11201
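
For reference, the two accessors mentioned here return different things: cluster_codes() takes only the object and returns a lookup table with one row per SOM node and one column per clustering, whereas cluster_ids() takes a clustering name and returns one label per cell. A minimal sketch of the difference, assuming CATALYST's bundled PBMC example data rather than the dataset in this thread:

```r
library(CATALYST)

# Assumption: bundled PBMC example data stands in for the real dataset
data(PBMC_fs, PBMC_panel, PBMC_md)
sce <- prepData(PBMC_fs, PBMC_panel, PBMC_md)
sce <- cluster(sce, features = "type", maxK = 20, seed = 1, verbose = FALSE)

# Lookup table: rows are SOM nodes, columns are the available clusterings
codes <- cluster_codes(sce)
print(dim(codes))
print(names(codes))

# Per-cell labels for one clustering; this is what to tabulate for cell counts
ids <- cluster_ids(sce, k = "meta18")
print(length(ids) == ncol(sce))
print(table(ids))

# Tabulating a column of the lookup table counts SOM nodes, not cells
print(table(codes[["meta18"]]))
```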

HelenaLC commented 1 year ago

Aha, finally got it! The plot above helped a lot, because CATALYST rarely uses default colors; there's a custom and consistent color palette for clusters across functions. I made a mistake earlier in that plotDR does not have a k argument (see ?plotDR). Instead, the clustering identifier should be passed directly to color_by, i.e., plotDR(sce, color_by = "merge10") and that's it :)
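
Putting the pieces of this thread together, an end-to-end version might look like the sketch below. It assumes CATALYST's bundled PBMC example data in place of the poster's dataset and a smaller cells value for speed; the merging table is the one posted earlier in the thread, and t-SNE requires the Rtsne package.

```r
library(CATALYST)

# Assumption: bundled PBMC example data stands in for the real dataset
data(PBMC_fs, PBMC_panel, PBMC_md)
sce <- prepData(PBMC_fs, PBMC_panel, PBMC_md)
sce <- cluster(sce, features = "type", maxK = 20, seed = 1, verbose = FALSE)

# Store the reduction in the same object that will be plotted later
sce <- runDR(sce, dr = "TSNE", cells = 200, features = "type")

# Merging table from the thread: 18 metaclusters collapsed into 10 labels
df <- data.frame(
  old_cluster = as.character(1:18),
  new_cluster = c("a", "a", "b", "b", "c", "d", "e", "a", "f",
                  "g", "d", "h", "i", "g", "i", "i", "c", "j")
)
sce <- mergeClusters(sce, k = "meta18", table = df, id = "merge10")

# plotDR() has no k argument; the clustering name goes directly into color_by
p <- plotDR(sce, dr = "TSNE", color_by = "merge10")
print(p)
```

Any name listed in names(cluster_codes(sce)), such as "meta18", can be passed to color_by in the same way.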

Tregwiz commented 1 year ago

Hi Helena, Thanks for your message. Yes this has solved it. Sorry I feel a bit silly as this was the original code line I used but with sce as x instead of tsne (which contained the dimensionality reduction). At least we got there in the end - thanks a lot for your help.

HelenaLC / CATALYST

Plotting manually assigned clusters on tSNE #311