HelenaLC / CATALYST

Cytometry dATa anALYsis Tools

Plotting manually assigned clusters on tSNE #311

Closed Tregwiz closed 1 year ago

Tregwiz commented 1 year ago

Hi, thanks for providing such a great pipeline for flow cytometry analysis. My issue is similar to a previous post: https://github.com/HelenaLC/CATALYST/issues/174. Apologies if this issue would be better placed in that thread; I was unsure whether you would see any further comments, as that thread is marked as closed. Basically, I completed FlowSOM clustering using CATALYST and, from the 18 clusters I obtained, reduced these to 10 by manually assigning specific clusters to the same new cluster:

old_cluster <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18")
new_cluster <- c("a", "a", "b", "b", "c", "d", "e", "a", "f", "g", "d", "h", "i", "g", "i", "i", "c", "j")
df <- data.frame(old_cluster, new_cluster)
sce <- mergeClusters(sce, k = "meta18", table = df, id = "merge10")

I am however unsure how to visualise these on the tSNE plot using plotDR, as I am unsure what to write in the color_by= "" argument.

Any help would be greatly appreciated. Thanks.
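
For readers following along: a merging table like the one above can be sanity-checked in base R before it is passed to mergeClusters. This is only a minimal sketch using the table from this post; the variable `n_merged` is an illustrative name, not part of CATALYST.

```r
# Merging table from the post: 18 FlowSOM metaclusters -> 10 manual labels.
old_cluster <- as.character(1:18)
new_cluster <- c("a", "a", "b", "b", "c", "d", "e", "a", "f",
                 "g", "d", "h", "i", "g", "i", "i", "c", "j")
df <- data.frame(old_cluster, new_cluster)

# Every original cluster should appear exactly once.
stopifnot(!anyDuplicated(df$old_cluster), nrow(df) == 18)

# Number of distinct merged labels (n_merged is an illustrative name).
n_merged <- length(unique(df$new_cluster))
print(n_merged)

# Which original clusters end up in each merged cluster.
print(split(df$old_cluster, df$new_cluster))
```

If the number of distinct labels is not what you expect, the table is the place to look before investigating the plot.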

HelenaLC commented 1 year ago

plotDR(sce, color_by = "cluster_id", k = "merge10") should do the trick. Have you read the function documentation? If it’s not clear from there, I’m open to feedback and improving it, thanks.

Tregwiz commented 1 year ago

Hi Helena, thanks a lot for your prompt reply. Unfortunately, that is the code I had tried previously; however, it returns the following:

Error in plotDR(sce, color_by = "cluster_id", k = "merge_10") : length(reducedDims(x)) != 0 is not TRUE

HelenaLC commented 1 year ago

Well, you need to runDR() first to get the t-SNE… sry, was assuming that was clear.

Tregwiz commented 1 year ago

Hi Helena, Thanks for your reply. Sorry I should have been clearer, I have previously done runDR() to generate the t-SNE, and can plot the FLOWSOM generated clusters on here using plotDR(sce, color_by = "meta18"), however when I try to plot the manually merged clusters using plotDR(sce, color_by = "cluster_id", k = "merge_10") it returns length(reducedDims(x)) != 0 is not TRUE . Best wishes, Gabriel

HelenaLC commented 1 year ago

Aha, I see, thanks for clarifying! What would really help is: could you please post the output of the following commands: sce and names(cluster_codes(sce)) (right before the error occurs), thanks!

Tregwiz commented 1 year ago

Hi Helena, sce returns the following:

class: SingleCellExperiment
dim: 15 407149
metadata(5): experiment_info chs_by_fcs cluster_codes SOM_codes delta_area
assays(2): counts exprs
rownames(15): CD64 PDL1 ... CD206 CD86
rowData names(4): channel_name marker_name marker_class used_for_clustering
colnames: NULL
colData names(4): sample_id condition patient_id cluster_id
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):

names(cluster_codes(sce)) returns (apologies for having a few different merging names):

[1] "som100" "meta2" "meta3" "meta4" "meta5" "meta6"
[7] "meta7" "meta8" "meta9" "meta10" "meta11" "meta12"
[13] "meta13" "meta14" "meta15" "meta16" "meta17" "meta18"
[19] "meta19" "meta20" "merging1" "merging" "merging2" "merge"
[25] "merge10"

HelenaLC commented 1 year ago

Yes, so as you can see, the reducedDimNames slot is empty, meaning there are no dimensionality reductions in the object, hence the error you're getting. Here's a couple things I'd like you to do/try:

Tregwiz commented 1 year ago

Hi Helena, really sorry, there was actually an error on my part previously. I had accidentally stored the result of runDR() under a new name, tsne, instead of overwriting sce, by running tsne <- runDR(sce, dr = "TSNE", cells = 2100, features = "type")

So tsne contains the dimensionality reduction rather than sce.
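
For readers following along: the behaviour described here can be reproduced on a toy object. The sketch below assumes only the SingleCellExperiment package; the toy counts and the manual `reducedDim<-` assignment are stand-ins for the real data and for runDR(), not what CATALYST does internally.

```r
library(SingleCellExperiment)

# Toy object standing in for the real data (hypothetical values): 4 markers x 10 cells.
set.seed(1)
sce <- SingleCellExperiment(assays = list(counts = matrix(rpois(40, 5), nrow = 4)))

# Stand-in for `tsne <- runDR(sce, ...)`: the result is stored under a new name.
tsne <- sce
reducedDim(tsne, "TSNE") <- matrix(rnorm(20), ncol = 2)  # 10 cells x 2 dimensions

# The original object was never modified, so it has no reduced dimensions.
print(reducedDimNames(sce))

# Only the new object carries the embedding.
print(reducedDimNames(tsne))
```

Checking reducedDimNames() on the object you are about to plot is a quick way to tell whether the length(reducedDims(x)) != 0 assertion will fail.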

tsne returns:

class: SingleCellExperiment
dim: 15 407149
metadata(5): experiment_info chs_by_fcs cluster_codes SOM_codes delta_area
assays(2): counts exprs
rownames(15): CD64 PDL1 ... CD206 CD86
rowData names(4): channel_name marker_name marker_class used_for_clustering
colnames: NULL
colData names(4): sample_id condition patient_id cluster_id
reducedDimNames(1): TSNE
mainExpName: NULL
altExpNames(0):

names(cluster_codes(tsne)) returns:

[1] "som100" "meta2" "meta3" "meta4" "meta5" "meta6" "meta7" "meta8" "meta9" "meta10"
[11] "meta11" "meta12" "meta13" "meta14" "meta15" "meta16" "meta17" "meta18" "meta19" "meta20"
[21] "merge10"

Now running plotDR(tsne, color_by ="cluster_id", k="merge10"), it returns the tSNE plot however it plots 100 different cluster IDs, rather than the 10 (from "merge10") I was expecting:

[image: t-SNE plot coloured by 100 cluster IDs]

Tregwiz commented 1 year ago

Also this is my session info:

R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.5.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggpubr_0.4.0 ggsignif_0.6.4 readxl_1.4.1
[4] destiny_3.10.0 uwot_0.1.14 umap_0.2.9.0
[7] scater_1.24.0 ggplot2_3.3.6 scuttle_1.6.3
[10] diffcyt_1.16.0 flowCore_2.8.0 cowplot_1.1.1
[13] carData_3.0-5 CATALYST_1.20.1 SingleCellExperiment_1.18.1
[16] SummarizedExperiment_1.26.1 Biobase_2.56.0 GenomicRanges_1.48.0
[19] GenomeInfoDb_1.32.4 IRanges_2.30.1 S4Vectors_0.34.0
[22] BiocGenerics_0.42.0 MatrixGenerics_1.8.1 matrixStats_0.62.0
[25] Matrix_1.5-1

loaded via a namespace (and not attached):
[1] rsvd_1.0.5 vcd_1.4-10 class_7.3-20
[4] foreach_1.5.2 lmtest_0.9-40 crayon_1.5.2
[7] laeken_0.5.2 MASS_7.3-58.1 nlme_3.1-160
[10] backports_1.4.1 rlang_1.0.6 XVector_0.36.0
[13] irlba_2.3.5.1 nloptr_2.0.3 limma_3.52.4
[16] smoother_1.1 BiocParallel_1.30.4 rjson_0.2.21
[19] glue_1.6.2 pheatmap_1.0.12 parallel_4.2.1
[22] vipor_0.4.5 tidyselect_1.2.0 XML_3.99-0.12
[25] tidyr_1.2.1 zoo_1.8-11 nnls_1.4
[28] RcppHNSW_0.4.1 magrittr_2.0.3 evaluate_0.17
[31] cli_3.4.1 zlibbioc_1.42.0 rstudioapi_0.14
[34] sp_1.5-0 bslib_0.4.1 RcppEigen_0.3.3.9.2
[37] BiocSingular_1.12.0 xfun_0.34 askpass_1.1
[40] clue_0.3-62 cluster_2.1.4 pcaMethods_1.88.0
[43] tibble_3.1.8 ggrepel_0.9.1 png_0.1-7
[46] withr_2.5.0 bitops_1.0-7 aws.signature_0.6.0
[49] ggforce_0.4.1 RBGL_1.72.0 ranger_0.14.1
[52] plyr_1.8.7 cellranger_1.1.0 ncdfFlow_2.42.1
[55] e1071_1.7-12 pillar_1.8.1 RcppParallel_5.1.5
[58] GlobalOptions_0.1.2 cachem_1.0.6 multcomp_1.4-20
[61] scatterplot3d_0.3-42 CytoML_2.8.1 TTR_0.24.3
[64] GetoptLong_1.0.5 DelayedMatrixStats_1.18.2 xts_0.12.2
[67] vctrs_0.5.0 generics_0.1.3 tools_4.2.1
[70] beeswarm_0.4.0 munsell_0.5.0 tweenr_2.0.2
[73] aws.s3_0.3.21 proxy_0.4-27 DelayedArray_0.22.0
[76] fastmap_1.1.0 compiler_4.2.1 abind_1.4-5
[79] GenomeInfoDbData_1.2.8 gridExtra_2.3 edgeR_3.38.4
[82] lattice_0.20-45 ggnewscale_0.4.8 ggpointdensity_0.1.0
[85] deldir_1.0-6 utf8_1.2.2 dplyr_1.0.10
[88] jsonlite_1.8.3 ggplot.multistats_1.0.0 scales_1.2.1
[91] graph_1.74.0 ScaledMatrix_1.4.1 sparseMatrixStats_1.8.0
[94] car_3.1-1 doParallel_1.0.17 latticeExtra_0.6-30
[97] reticulate_1.26 rmarkdown_2.17 sandwich_3.0-2
[100] Rtsne_0.16 igraph_1.3.5 survival_3.4-0
[103] yaml_2.3.6 plotrix_3.8-2 cytolib_2.8.0
[106] flowWorkspace_4.8.0 htmltools_0.5.3 locfit_1.5-9.6
[109] viridisLite_0.4.1 digest_0.6.30 assertthat_0.2.1
[112] remotes_2.4.2 data.table_1.14.4 drc_3.0-1
[115] splines_4.2.1 labeling_0.4.2 ggsci_2.9
[118] RCurl_1.98-1.9 broom_1.0.1 colorspace_2.0-3
[121] ConsensusClusterPlus_1.60.0 base64enc_0.1-3 BiocManager_1.30.19
[124] ggbeeswarm_0.6.0 shape_1.4.6 nnet_7.3-18
[127] sass_0.4.2 Rcpp_1.0.9 mvtnorm_1.1-3
[130] circlize_0.4.15 FlowSOM_2.4.0 RProtoBufLib_2.8.0
[133] fansi_1.0.3 VIM_6.2.2 R6_2.5.1
[136] grid_4.2.1 ggridges_0.5.4 lifecycle_1.0.3
[139] curl_4.3.3 minqa_1.2.5 jquerylib_0.1.4
[142] robustbase_0.95-0 TH.data_1.1-1 RColorBrewer_1.1-3
[145] iterators_1.0.14 stringr_1.4.1 beachmat_2.12.0
[148] polyclip_1.10-4 purrr_0.3.5 ComplexHeatmap_2.12.1
[151] openssl_2.0.4 codetools_0.2-18 gtools_3.9.3
[154] RSpectra_0.16-1 gtable_0.3.1 DBI_1.1.3
[157] httr_1.4.4 highr_0.9 stringi_1.7.8
[160] reshape2_1.4.4 farver_2.1.1 viridis_0.6.2
[163] ggthemes_4.2.4 hexbin_1.28.2 Rgraphviz_2.40.0
[166] xml2_1.3.3 colorRamps_2.3.1 ggcyto_1.24.1
[169] boot_1.3-28 BiocNeighbors_1.14.0 lme4_1.1-31
[172] interp_1.1-3 scattermore_0.8 DEoptimR_1.0-11
[175] jpeg_0.1-9 pkgconfig_2.0.3 rstatix_0.7.0
[178] knitr_1.40

HelenaLC commented 1 year ago

Okay, so we're making progress. Could you post the table you used for merging? And, maybe the output of table(cluster_codes(tsne, "merge10"))?

Tregwiz commented 1 year ago

old_cluster <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18")
new_cluster <- c("a", "a", "b", "b", "c", "d", "e", "a", "f", "g", "d", "h", "i", "g", "i", "i", "c", "j")
df <- data.frame(old_cluster, new_cluster)

tsne <- mergeClusters(tsne, k = "meta18", table = df, id = "merge10")

table(cluster_codes(tsne, "merge10")) doesn't return anything. Did you mean table(cluster_ids(tsne, k = "merge10"))? This returns:

     a      b      c      d      e      f      g      h      i      j
  8191   3987   4658  28965  66522 225220    476  30638  27291  11201

HelenaLC commented 1 year ago

Aha, finally got it! The plot above helped a lot, because CATALYST rarely uses default colors; there's a custom and consistent color palette for clusters across functions. I made a mistake earlier in that plotDR does not have a k argument (see ?plotDR). Instead, the clustering identifier should be passed directly to color_by, i.e., plotDR(sce, color_by = "merge10") and that's it :)
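
For readers following along: the whole sequence can be tried end to end on the example data shipped with CATALYST. This is a sketch, not the poster's analysis: the PBMC example objects, the bundled merging_table, the id "merging1", the seed, and the cell count passed to runDR() are choices made for illustration. The point is that runDR() is assigned back to the same object that holds the merged clustering, and the merging id goes straight into color_by.

```r
library(CATALYST)
library(SingleCellExperiment)

# Example data shipped with CATALYST (not the poster's data).
data(PBMC_fs, PBMC_panel, PBMC_md, merging_table)
sce <- prepData(PBMC_fs, PBMC_panel, PBMC_md)

# FlowSOM clustering followed by metaclustering (meta2 ... meta20).
sce <- cluster(sce, features = "type", xdim = 10, ydim = 10, maxK = 20, seed = 1234)

# Manual merging of the 20 metaclusters, stored under the id "merging1".
sce <- mergeClusters(sce, k = "meta20", table = merging_table, id = "merging1")

# Assign back to the same object so the clustering and the t-SNE live together.
sce <- runDR(sce, dr = "TSNE", cells = 100, features = "type")

# Both pieces must be present on the object handed to plotDR().
stopifnot("TSNE" %in% reducedDimNames(sce),
          "merging1" %in% names(cluster_codes(sce)))

# Pass the merging id directly to color_by.
p <- plotDR(sce, dr = "TSNE", color_by = "merging1")
print(p)
```

With your own data, replace "merging1" with the id you gave to mergeClusters (here, "merge10") and use the object that actually contains the t-SNE.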

Tregwiz commented 1 year ago

Hi Helena, Thanks for your message. Yes this has solved it. Sorry I feel a bit silly as this was the original code line I used but with sce as x instead of tsne (which contained the dimensionality reduction). At least we got there in the end - thanks a lot for your help.