jokergoo / ComplexHeatmap

Make Complex Heatmaps
https://jokergoo.github.io/ComplexHeatmap-reference/book/

Programmatically label blocks using anno_block() based on row_split data.frame #705

Closed malisas closed 3 years ago

malisas commented 3 years ago

I am passing a data.frame to the row_split argument of ComplexHeatmap::Heatmap(). In order to label the resulting blocks, I am using anno_block()'s label argument. (See the first working example below)

I would like to eventually call Heatmap() programmatically, meaning that the number and names of the columns in row_split_df will differ from call to call. The issue I'm having is that each anno_block() is passed to rowAnnotation() as a named argument through the ellipsis (...), so the foo and bar anno_blocks have to be written out one by one. This makes it difficult to label the blocks programmatically.

As a work-around, I tried to use do.call to call rowAnnotation with a list of anno_block()s (see the second code example), but this fails.

Do you have ideas about either of the following?
1) How to tweak my do.call call so that foo and bar get recognized as names
2) An alternative to anno_block() for labeling blocks that works programmatically

I don't understand do.call and ellipses very well, so there may be a simple fix that involves quoting/evaluation/substitution. I can't seem to figure it out, though.

Thank you! This package is great!

library(ComplexHeatmap)
#> Loading required package: grid
#> ========================================
#> ComplexHeatmap version 2.6.2
#> Bioconductor page: http://bioconductor.org/packages/ComplexHeatmap/
#> Github page: https://github.com/jokergoo/ComplexHeatmap
#> Documentation: http://jokergoo.github.io/ComplexHeatmap-reference
#> 
#> If you use it in published research, please cite:
#> Gu, Z. Complex heatmaps reveal patterns and correlations in multidimensional 
#>   genomic data. Bioinformatics 2016.
#> 
#> This message can be suppressed by:
#>   suppressPackageStartupMessages(library(ComplexHeatmap))
#> ========================================
row_split_df <- data.frame(foo = rep(1:2, each=5),
                           bar = c(rep(1, 3), rep(2, 2), rep(1, 3), rep(2, 2)))

# This works
Heatmap(matrix(rnorm(100), 10),
        cluster_rows = FALSE, show_row_dend = FALSE, cluster_columns = FALSE,
        row_split = row_split_df,
        right_annotation = rowAnnotation(foo =
                                           anno_block(gp = gpar(fill=rep(c("#FDBF6F", "#B15928"), each = 2), col="white"),
                                                      labels = rep(c("Group_1", "Group_2"), each = 2),
                                                      labels_gp = gpar(col = "black", fontsize=12)),
                                         bar =
                                           anno_block(gp = gpar(fill=rep(c("#8BA2D4", "#BB8BD4"), times = 2), col="white"),
                                                      labels = rep(c("Group_A", "Group_B"), times = 2),
                                                      labels_gp = gpar(col = "black", fontsize=12))))


# This doesn't work
# Using do.call with rowAnnotation
anno_block_list <- list(foo =
                          anno_block(gp = gpar(fill=rep(c("#FDBF6F", "#B15928"), each = 2), col="white"),
                                     labels = rep(c("Group_1", "Group_2"), each = 2),
                                     labels_gp = gpar(col = "black", fontsize=12)),
                        bar =
                          anno_block(gp = gpar(fill=rep(c("#8BA2D4", "#BB8BD4"), times = 2), col="white"),
                                     labels = rep(c("Group_A", "Group_B"), times = 2),
                                     labels_gp = gpar(col = "black", fontsize=12)))
Heatmap(matrix(rnorm(100), 10),
        cluster_rows = FALSE, show_row_dend = FALSE, cluster_columns = FALSE,
        row_split = row_split_df,
        right_annotation = do.call(rowAnnotation, anno_block_list))
#> Error: annotations should have names.

do.call(rowAnnotation, anno_block_list)
#> Error: annotations should have names.

Created on 2021-03-10 by the reprex package (v1.0.0)

jokergoo commented 3 years ago

Hi, I haven't considered the use of do.call(rowAnnotation, ...) yet, so that is a bug.

Instead, you can directly use HeatmapAnnotation():

anno_block_list <- list(foo =
                          anno_block(gp = gpar(fill=rep(c("#FDBF6F", "#B15928"), each = 2), col="white"),
                                     labels = rep(c("Group_1", "Group_2"), each = 2),
                                     labels_gp = gpar(col = "black", fontsize=12), which = "row"),
                        bar =
                          anno_block(gp = gpar(fill=rep(c("#8BA2D4", "#BB8BD4"), times = 2), col="white"),
                                     labels = rep(c("Group_A", "Group_B"), times = 2),
                                     labels_gp = gpar(col = "black", fontsize=12), which = "row"),
                      which = "row")
do.call(HeatmapAnnotation, anno_block_list)

Note that you need to add which = "row" manually in several places: once in each anno_block() call and once as an element of the list itself.

I will fix this problem soon.

Thanks!
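For a row_split data.frame whose columns are not known in advance, the list passed to do.call() can itself be generated. The following is a minimal sketch, not a confirmed recipe: the "<column>_<level>" labelling scheme is hypothetical, and it assumes that, with cluster_rows = FALSE, the slices appear from top to bottom in the sorted order of the level combinations (which matches the four slices in the example above). The ComplexHeatmap calls are guarded so that the label computation can be checked on its own.

```r
row_split_df <- data.frame(foo = rep(1:2, each = 5),
                           bar = c(rep(1, 3), rep(2, 2), rep(1, 3), rep(2, 2)))

# One row per slice, sorted by the split columns from left to right.
# ASSUMPTION: this equals the top-to-bottom slice order when cluster_rows = FALSE.
row_order <- do.call(order, unname(as.list(row_split_df)))
slices <- unique(row_split_df[row_order, , drop = FALSE])

# Hypothetical labelling scheme: "<column>_<level>" for every slice
slice_labels <- lapply(names(row_split_df), function(nm) {
    paste0(nm, "_", slices[[nm]])
})
names(slice_labels) <- names(row_split_df)

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
    # One anno_block() per split column, named after that column
    anno_block_list <- lapply(slice_labels, function(lab) {
        ComplexHeatmap::anno_block(labels = lab, which = "row")
    })
    # which = "row" is also needed as an argument of HeatmapAnnotation()
    anno_block_list$which <- "row"
    row_anno <- do.call(ComplexHeatmap::HeatmapAnnotation, anno_block_list)
}
```

The resulting row_anno could then be passed as right_annotation in Heatmap(). Fill colours could be generated per column in the same way and passed through gp.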

jokergoo commented 3 years ago

One more thing about your example: I saw that you set cluster_rows = FALSE. The labels and related settings in anno_block() always correspond to the heatmap slices from top to bottom after the heatmap is generated, so it is difficult to map the labels to the slices when clustering is applied. In other words, if you set cluster_rows = TRUE, the labels in anno_block() can end up on the wrong slices, and you would need to inspect the drawn heatmap and manually reorder the labels.

To solve this problem and make it possible to control the labels programmatically, I added a new argument, graphics, to anno_block(). Its value should be a user-defined function with two arguments: (1) the row indices in the current slice, and (2) the levels of the splitting variables in the current slice. The graphics function is executed for every slice.

Following are examples:

col = c("1" = "red", "2" = "blue", "A" = "green", "B" = "orange")
Heatmap(matrix(rnorm(100), 10), row_km = 2, row_split = sample(c("A", "B"), 10, replace = TRUE)) + 
rowAnnotation(foo = anno_block(
    graphics = function(index, levels) {
        grid.rect(gp = gpar(fill = col[levels[2]], col = "black"))
        grid.text(paste(levels, collapse = ","), 0.5, 0.5, rot = 90,
            gp = gpar(col = col[levels[1]]))
    }
))

image

We can define a mapping variable, then the labels of the splitting levels can be changed:

labels = c("1" = "one", "2" = "two", "A" = "Group_A", "B" = "Group_B")
Heatmap(matrix(rnorm(100), 10), row_km = 2, row_split = sample(c("A", "B"), 10, replace = TRUE)) + 
rowAnnotation(foo = anno_block(
    graphics = function(index, levels) {
        grid.rect(gp = gpar(fill = col[levels[2]], col = "black"))
        grid.text(paste(labels[levels], collapse = ","), 0.5, 0.5, rot = 90,
            gp = gpar(col = col[levels[1]]))
    }
))

image

As you can see, the labels in anno_block() always correspond to the correct heatmap slice, even if clustering is applied.

You can also construct more complex labels:

Heatmap(matrix(rnorm(100), 10), row_km = 2, row_split = sample(c("A", "B"), 10, replace = TRUE)) + 
rowAnnotation(foo = anno_block(
    graphics = function(index, levels) {
        grid.rect(gp = gpar(fill = col[levels[2]], col = "black"))
        txt = paste(levels, collapse = ",")
        txt = paste0(txt, "\n", length(index), " rows")
        grid.text(txt, 0.5, 0.5, rot = 0,
            gp = gpar(col = col[levels[1]]))
    },
    width = unit(3, "cm")
))

image

One drawback of setting graphics is that anno_block() no longer knows what will be drawn, so it cannot calculate the space for the graphics automatically. If the graphics are too large, you need to set the width argument manually (this can in fact be done programmatically).
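One possible way to set the width programmatically is to measure the text with grid before building the annotation. This is only a sketch under assumptions: the labels below are hypothetical, the text is measured with the default font (not necessarily the gp used when drawing), the labels are drawn horizontally (rot = 0), and the 4 mm padding is arbitrary.

```r
library(grid)

# Hypothetical multi-line labels of the kind built inside graphics()
txt <- c("1,A\n3 rows", "2,B\n12 rows")

# Measure every line separately and keep the widest one (in mm)
label_lines <- unlist(strsplit(txt, "\n", fixed = TRUE))
line_mm <- convertWidth(stringWidth(label_lines), "mm", valueOnly = TRUE)

# Widest line plus an arbitrary 4 mm of padding
block_width <- unit(max(line_mm) + 4, "mm")
```

block_width could then be supplied as the width argument of anno_block() in place of a fixed value such as unit(3, "cm").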

Last, for your example, it would be:

col1 = c("1" = "#FDBF6F", "2" = "#B15928")
col2 = c("1" = "#8BA2D4", "2" = "#BB8BD4")
label1 = c("1" = "Group_1", "2" = "Group_2")
label2 = c("1" = "Group_A", "2" = "Group_B")

block1 = anno_block(
    graphics = function(index, levels) {
        grid.rect(gp = gpar(fill = col1[levels[1]], col = "black"))
        grid.text(label1[levels[1]], 0.5, 0.5, rot = 90)
    }, which = "row"
)
block2 = anno_block(
    graphics = function(index, levels) {
        # bar is the second split column, so use levels[2]
        grid.rect(gp = gpar(fill = col2[levels[2]], col = "black"))
        grid.text(label2[levels[2]], 0.5, 0.5, rot = 90)
    }, which = "row"
)

Heatmap(matrix(rnorm(100), 10),
        cluster_columns = FALSE,
        row_split = row_split_df,
        right_annotation = rowAnnotation(foo = block1, bar = block2)
)

image

Note that because anno_block() is not called directly inside rowAnnotation() here, you need to add which = "row" explicitly (I cannot remember whether it is strictly required, but it is safer to specify it). Now you only need to change row_split_df dynamically, and perhaps col1, col2, label1 and label2 as well; the rest of the code can stay unchanged.
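To avoid writing one block per split column by hand, the graphics functions can be produced by a small factory. This is a sketch rather than part of the package: make_graphics, fills and block_labels are hypothetical names, and it assumes that levels[i] holds the level of the i-th split column. The ComplexHeatmap calls are guarded and only run if the installed anno_block() has the graphics argument.

```r
library(grid)

# Hypothetical per-column settings, keyed by column name and then by level
fills <- list(foo = c("1" = "#FDBF6F", "2" = "#B15928"),
              bar = c("1" = "#8BA2D4", "2" = "#BB8BD4"))
block_labels <- list(foo = c("1" = "Group_1", "2" = "Group_2"),
                     bar = c("1" = "Group_A", "2" = "Group_B"))

# Build the graphics function for the i-th split column.
# ASSUMPTION: levels[i] is the level of the i-th column of row_split_df.
make_graphics <- function(i, fill, label) {
    force(i); force(fill); force(label)
    function(index, levels) {
        lev <- levels[i]
        grid.rect(gp = gpar(fill = fill[[lev]], col = "black"))
        grid.text(label[[lev]], 0.5, 0.5, rot = 90)
    }
}

split_cols <- names(fills)
graphics_list <- lapply(seq_along(split_cols), function(i) {
    make_graphics(i, fills[[split_cols[i]]], block_labels[[split_cols[i]]])
})
names(graphics_list) <- split_cols

if (requireNamespace("ComplexHeatmap", quietly = TRUE) &&
    "graphics" %in% names(formals(ComplexHeatmap::anno_block))) {
    block_list <- lapply(graphics_list, function(g) {
        ComplexHeatmap::anno_block(graphics = g, which = "row")
    })
    block_list$which <- "row"
    right_anno <- do.call(ComplexHeatmap::HeatmapAnnotation, block_list)
}
```

Looking levels up with [[ instead of [ makes an unknown level fail with an error instead of silently drawing NA.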

malisas commented 3 years ago

Hi @jokergoo , thank you very much for your thorough response!

do.call(HeatmapAnnotation, anno_block_list) will work great for my current task. I also appreciate the additional examples you provide using the graphics argument, for additional flexibility over the block formatting.

I will leave this issue open since you mentioned you might be fixing the minor bug -- but please feel free to close it if you like, since my problem is solved.

jokergoo commented 3 years ago

Hmm, this is interesting:

do.call(rowAnnotation, anno_block_list)

causes the error, while

do.call("rowAnnotation", anno_block_list)

does not.
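One difference between the two forms that can be seen in base R, and that might be related (the thread does not confirm the actual cause), is what the called function sees when it inspects its own call. call_info below is a hypothetical stand-in for any function that does this:

```r
# A stand-in for any function that inspects its own call
call_info <- function(...) {
    list(head = class(sys.call()[[1]]),   # what the call starts with
         arg_names = names(list(...)))    # names received through ...
}

args <- list(foo = 1, bar = 2)

# Function object: the function itself is embedded in the call
by_object <- do.call(call_info, args)

# Character name: the call starts with a symbol, as if typed by hand
by_name <- do.call("call_info", args)

by_object$head   # "function"
by_name$head     # "name"
```

In both cases arg_names is c("foo", "bar"), so the names in the list do reach the function; only the recorded call differs.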

jokergoo commented 3 years ago

I have fixed this bug:

> anno_block_list <- list(foo =
+                           anno_block(gp = gpar(fill=rep(c("#FDBF6F", "#B15928"), each = 2), col="white"),
+                                      labels = rep(c("Group_1", "Group_2"), each = 2),
+                                      labels_gp = gpar(col = "black", fontsize=12), which = "row"),
+                         bar =
+                           anno_block(gp = gpar(fill=rep(c("#8BA2D4", "#BB8BD4"), times = 2), col="white"),
+                                      labels = rep(c("Group_A", "Group_B"), times = 2),
+                                      labels_gp = gpar(col = "black", fontsize=12), which = "row"))
> do.call(rowAnnotation, anno_block_list)
A HeatmapAnnotation object with 2 annotations
  name: heatmap_annotation_0
  position: row
  items: unknown
  width: 16.4305264701813mm
  height: 1npc
  this object is subsetable

 name annotation_type color_mapping              width
  foo    anno_block()               8.03953333333334mm
  bar    anno_block()               8.03953333333334mm