markrobinsonuzh / cytofWorkflow

Order and labeling of clusters from most to least abundant #14

Closed CNicholasMDA closed 4 years ago

CNicholasMDA commented 4 years ago

Hi, how feasible would it be to modify the script so that the clusters are ranked from most to least abundant, and then numbered and colored in that order? With thanks, Courtney

HelenaLC commented 4 years ago

Dear Courtney, the simplest way I can think of is to overwrite the existing cluster IDs with a reordered version, ordered by the number of cells in each cluster. Here's a minimal example:

library(CATALYST)
sce <- prepData(PBMC_fs, PBMC_panel, PBMC_md)
sce <- cluster(sce)

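# number of cells in each of the 20 metaclusters
ncs <- table(cluster_ids(sce, "meta20"))
# order() returns positions in ncs; these coincide with the cluster IDs
# as long as the cluster levels are 1, 2, ..., 20
tbl <- data.frame(
    old = order(ncs, decreasing = TRUE),
    new = seq_len(20))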

sce <- mergeClusters(sce, 
    k = "meta20", id = "meta20", 
    table = tbl, overwrite = TRUE)

plotExprHeatmap(sce, by = "cluster_id", k = "meta20", 
    row_clust = FALSE, bars = TRUE, perc = TRUE)

To do the above for all available clusterings:

for (k in names(cluster_codes(sce))) {
    kids <- cluster_ids(sce, k)
    tbl <- data.frame(
        old = order(table(kids), decreasing = TRUE),
        new = seq_len(nlevels(kids)))
    sce <- mergeClusters(sce, k = k, id = k, table = tbl, overwrite = TRUE)
}
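
For anyone who wants to see what the old/new table does without loading CATALYST, here is a rough base-R sketch on made-up data. The toy factor `ids` and the names `tbl` and `new_ids` are purely illustrative, and the `match()` lookup only mimics the relabeling; it is not how `mergeClusters` is implemented internally. Unlike the snippets above, it uses `names(ncs)` so that the table holds the actual labels rather than positions.

```r
# Toy cluster assignments (hypothetical data): 10 cells in 4 clusters.
ids <- factor(c(1, 2, 2, 3, 3, 3, 3, 4, 4, 4), levels = 1:4)

# Number of cells per cluster, in level order.
ncs <- table(ids)

# Mapping table: the most abundant old ID becomes 1, the next 2, and so on.
tbl <- data.frame(
    old = names(ncs)[order(ncs, decreasing = TRUE)],
    new = seq_along(ncs))

# Relabel each cell by looking up its old ID in the mapping table.
new_ids <- factor(
    tbl$new[match(as.character(ids), as.character(tbl$old))],
    levels = tbl$new)

print(tbl)
print(table(new_ids))  # cluster sizes are now 4, 3, 2, 1
```

The old cluster 3 (4 cells) ends up as cluster 1, and the old cluster 1 (1 cell) ends up as cluster 4.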

Note: This requires the new Bioconductor 3.11 release version of CATALYST, which becomes available today. In earlier versions, there was no way to turn off the row clustering, so the original cluster order could not be retained.
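
If you'd like to double-check the result, one option is a small sanity check that cluster sizes never increase from the first ID to the last. The helper `is_abundance_ordered` below is a hypothetical name of my own, not part of CATALYST; it is shown on toy factors, and the commented line at the end is only a suggestion for how it might be applied to `sce`.

```r
# Hypothetical helper: TRUE if cluster sizes are non-increasing across
# the factor levels, i.e. the first level is the most abundant cluster.
is_abundance_ordered <- function(ids) {
    ncs <- as.vector(table(ids))
    # reversed counts are sorted ascending iff the counts are non-increasing
    !is.unsorted(rev(ncs))
}

# Toy data: 'before' has sizes 1, 2, 3; 'after' has sizes 3, 2, 1.
before <- factor(c(1, 2, 2, 3, 3, 3), levels = 1:3)
after <- factor(c(3, 2, 2, 1, 1, 1), levels = 1:3)

print(is_abundance_ordered(before))  # FALSE
print(is_abundance_ordered(after))   # TRUE

# With CATALYST loaded (not run here), this could be applied as, e.g.:
# is_abundance_ordered(cluster_ids(sce, "meta20"))
```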
