Closed crazyhottommy closed 3 years ago
ComplexHeatmap only supports splitting a heatmap by rows because you can simply split the matrix by columns and concatenate each submatrix afterwards.
For your situation, I think you should do k-means clustering on the complete matrix beforehand and assign the result to the split argument later.
mat
mat1 = mat[, 1:3]
mat2 = mat[, 4:6]
mat3 = mat[, 7:9]
km = kmeans(mat, centers = 3)$cluster
row_order = hclust(mat)$order
Heatmap(mat1, row_order = row_order, cluster_rows = FALSE, split = cluster) +
Heatmap(mat2) +
Heatmap(mat3)
Great, thanks for the tip!
Ming
Hi,
I realize that when combining multiple heatmaps, the dendrograms of the individual heatmaps are not connected.
I want to have something like gapmap, in which I can split both columns and rows (by specifying the height of the tree cut).
I hope it does not add too much complexity to the code; otherwise, there is no need to add these features. I do like ComplexHeatmap the best (for its flexibility and user support) after trying various packages.
Thanks, Ming
Thanks for the comment! That is also what I considered before, but according to current design of the package, it is not easy to support this feature. I totally agree this would be very useful and I will try to implement it once I have enough time.
To show different groups of row clusters and column clusters, I think there is another way, which is to color branches of the dendrogram with different colors, and this is supported by ComplexHeatmap.
Thanks Zuguang! ComplexHeatmap is now my go-to package for drawing heatmaps.
I know you are better than me with R. I want to share my 2 cents on heatmaps :)
https://rpubs.com/crazyhottommy/a-tale-of-two-heatmap-functions
Ming
Thanks for sharing the post! I didn't know that the base heatmap() scales the matrix only for coloring! I remember when I first learned heatmap(), I was also puzzled by the color of the heatmap.
It is good that Heatmap does not do any scaling inside :)
Hope you have some time now developing the splitting by columns :)
I bumped into this post, and I am a lover of ComplexHeatmap! I have the same situation described by the original poster above, i.e.,
"A more detailed question is: when I do a supervised-clustering, I want to first split the columns (samples) into say 3 pre-defined subgroups first, and then do clustering within each subgroup for columns and do a k-means for all rows. Is it possible?"
I went through the above answers, but I didn't clearly understand how to achieve clustering within the subgroups of columns, while having an overall clustering for the rows. Could anybody please explain?
I don't understand how the first answer fully achieves this. Specifically, the object "km" is computed, but then not used in the code after that?
Appreciate any responses!
Thanks.
@jokergoo I used your method and it works well for small heatmaps, but if I want to have large heatmaps with many splits, it can literally take all day to plot (I'm on an x2680 CPU and I think it may take 24 hours). Is there another way to split by columns, if I might ask? Is there a way to use parallelization on ComplexHeatmap (use multiple cores to plot a single heatmap)?
@jamesdalg Since many people are requesting column splitting in heatmaps, I will put it with highest priority.
The slowest part of making heatmaps is clustering. Generally, a heatmap is a descriptive visualization that basically aims to find patterns in sub-regions of the heatmap. On the other hand, when you have very large matrices, say millions of rows or millions of columns, and you plot them to a file or on the screen, neighbouring rows or columns are actually merged due to the resolution of the file or the screen. So what I always do is first randomly sample rows or columns (say ~5000); the pattern in the randomly sampled heatmap is actually the same as in the complete heatmap.
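The subsampling idea could be sketched roughly as follows. The matrix size, the 5000-row cutoff and the variable names (big_mat, sub_mat) are illustrative assumptions, not anything prescribed by the package.

```r
set.seed(1)
# Hypothetical large matrix: 20000 rows x 10 columns
big_mat = matrix(rnorm(20000 * 10), nrow = 20000)

# Randomly keep at most 5000 rows; sort the indices to preserve the original row order
n_keep = min(5000, nrow(big_mat))
idx = sort(sample(nrow(big_mat), n_keep))
sub_mat = big_mat[idx, , drop = FALSE]

# Build the heatmap object only when ComplexHeatmap is installed;
# call draw(ht) to actually cluster and render it
if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  ht = ComplexHeatmap::Heatmap(sub_mat, name = "sampled", show_row_names = FALSE)
}
```

Clustering then runs on 5000 rows instead of the full matrix, which is where most of the time is saved.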
If you could, if there is a way to make the .combine option in foreach take a "HeatmapList" object, that might help things a lot... or if there was a way to convert a list of Heatmap objects to a HeatmapList, that would help too (if there was public constructor from a list object). I think having a split_columns parameter might be the best though (not that I can decide).
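One possible way to turn a plain list of Heatmap objects into a HeatmapList is to fold the list with the `+` operator, since adding heatmaps is how the package concatenates them. This is only a sketch: the matrix, the column blocks and the names below are made up for illustration, and whether it helps with foreach has not been checked here.

```r
ht_list = NULL
if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  library(ComplexHeatmap)
  set.seed(1)
  mat = matrix(rnorm(90), nrow = 10)
  # Hypothetical column blocks, one heatmap per block
  col_blocks = list(1:3, 4:6, 7:9)
  ht_objs = lapply(seq_along(col_blocks), function(i) {
    Heatmap(mat[, col_blocks[[i]]], name = paste0("block", i))
  })
  # Fold the list with `+`: ht_objs[[1]] + ht_objs[[2]] + ht_objs[[3]]
  ht_list = Reduce(`+`, ht_objs)
}
```

The individual heatmaps can be built in any order (or in parallel workers), and only the final fold has to happen in one place.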
Is there a cosmetic way to just highlight certain blocks within a heatmap? That's basically what I'm trying to do. The gaps are just there to visually set parts of the heatmap apart.
Currently you can use the decorate_heatmap_body() function. E.g.
mat = matrix(rnorm(100), 10)
Heatmap(mat, name = "test")
decorate_heatmap_body("test", {
grid.rect(0, 0, width = 0.4, height = 1, just = c("left", "bottom"),
gp = gpar(lwd = 2, col = "black", fill = "transparent"))
})
As an alternative to split columns, is there a way to draw vertical reference lines? thanks wei
This can be done by decorate_heatmap_body() if you know where to put the vertical line.
mat = matrix(rnorm(100), 10)
Heatmap(mat, name = "foo")
decorate_heatmap_body("foo", {
# assume columns are split after the 4th column (after reordering)
grid.lines(c(4/10, 4/10), c(0, 1), gp = gpar(col = "red", lty = 2))
})
Instead of an actual reference line, can we achieve the visual effect of split columns by widening the right (or left) border for a column of cells? I feel the reference lines may cover a few cells near the border? Not sure which is easier to implement.
To me, the reference lines would be fairly close to what I want, but the following code only generates the reference line at slice 1. I tried to provide slice=1:6, but got the following error. thanks!
wei
Error in grid.Call.graphics(L_downvppath, name$path, name$name, strict) : Viewport 'foo_heatmap_body_6' was not found
`library(ComplexHeatmap)
library(circlize)
set.seed(123)
mat = cbind(rbind(matrix(rnorm(16, -1), 4), matrix(rnorm(32, 1), 8)),
            rbind(matrix(rnorm(24, 1), 4), matrix(rnorm(48, -1), 8)))
mat = mat[sample(nrow(mat), nrow(mat)), sample(ncol(mat), ncol(mat))]
rownames(mat) = paste0("R", 1:12)
colnames(mat) = paste0("C", 1:10)
Heatmap(mat, name = "foo", split = paste('long name', rep(1:6, each = 2)))
decorate_heatmap_body("foo", {
    grid.lines(c(4/10, 4/10), c(0, 1), gp = gpar(col = "green", lwd = 2))
    grid.lines(c(7/10, 7/10), c(0, 1), gp = gpar(col = "green", lwd = 2))
})`
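A possible workaround, sketched under the assumption that the slice argument of decorate_heatmap_body() takes one slice index at a time, is to loop over the row slices and decorate each one separately. The data here are random and only stand in for the matrix above.

```r
set.seed(123)
mat = matrix(rnorm(120), nrow = 12)
# Six row slices with two rows each, as in the example above
row_groups = paste("long name", rep(1:6, each = 2))
n_slices = length(unique(row_groups))

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  library(ComplexHeatmap)
  library(grid)
  pdf(NULL)  # draw to a null device so no file is written
  draw(Heatmap(mat, name = "foo", split = row_groups))
  # Decorate one row slice per iteration instead of passing slice = 1:6
  for (i in seq_len(n_slices)) {
    decorate_heatmap_body("foo", {
      grid.lines(c(4/10, 4/10), c(0, 1), gp = gpar(col = "green", lwd = 2))
      grid.lines(c(7/10, 7/10), c(0, 1), gp = gpar(col = "green", lwd = 2))
    }, slice = i)
  }
  dev.off()
}
```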
Hi Question: can I use column_split without clustering the columns ?
I need to split columns in the heatmap but I am not clustering my columns, I want to retain the order in the original matrix.
I tried the following as I have 8 columns and I want to split them into 2, the first slice containing the first 4 columns and the second split, containing the last 4.
Below is the code
`f_heat <- Heatmap(as.matrix(ZZ0[, c(2:9)]),
col = inferno(100),
border = TRUE,
rect_gp = gpar(col = "black", lty = 1, lwd = 0.01),
name = "log2FC",
cluster_columns = FALSE,
cluster_rows = hclust(dist(ZZ0[, 2:9], method = "euclidean"), method = "ward.D"),
show_row_dend = FALSE,
show_row_names = FALSE,
show_column_names = FALSE,
#row_names_gp = gpar(fontsize = 4.5, fontface = "bold"),
#column_names_side = NULL,
#column_names_gp = gpar(fontsize= 10, fontface = "bold",
# col = c(rep("#440154FF", 4), rep("#440154FF", 4))),
#column_names_rot = 360,
show_heatmap_legend = TRUE,
row_split = 18,
row_title_gp = gpar(fontsize = 9, fontface = "bold"),
row_title_rot = 0,
row_gap = unit(0.5, "mm"),
cluster_row_slices = FALSE,
top_annotation = colu_anno,
column_order = c(1,2,3,4,5,6,7,8),
column_split = factor(rep(c("G", "F"), 4), levels = c("G", "F")),
cluster_column_slices = FALSE,
column_title_gp = gpar(fontsize = 9, fontface = "bold"),
column_gap = unit(0.5, "mm"))`
However, column clustering still seems to happen and the original order of the columns is not retained. The examples in section 2.7 at the link below all use clustering on columns. https://jokergoo.github.io/ComplexHeatmap-reference/book/a-single-heatmap.html#heatmap-split Any help is appreciated. Thanks !!!
Gowthamee
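One thing worth checking in the code above is the split vector itself: rep(c("G", "F"), 4) alternates the two labels, so grouping the columns by slice necessarily interleaves them even with cluster_columns = FALSE. A minimal sketch, assuming the intent is "first four columns, then last four columns" (the toy matrix and names are illustrative):

```r
# Alternating labels: columns 1, 3, 5, 7 are "G" and columns 2, 4, 6, 8 are "F"
alternating = factor(rep(c("G", "F"), 4), levels = c("G", "F"))
# Contiguous labels: columns 1-4 are "G" and columns 5-8 are "F"
contiguous = factor(rep(c("G", "F"), each = 4), levels = c("G", "F"))

# Column order obtained when columns are grouped by slice
order(alternating)  # 1 3 5 7 2 4 6 8
order(contiguous)   # 1 2 3 4 5 6 7 8

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  set.seed(1)
  toy = matrix(rnorm(80), nrow = 10)
  ht = ComplexHeatmap::Heatmap(toy, name = "log2FC",
                               cluster_columns = FALSE,
                               column_split = contiguous)
}
```

With the contiguous factor, the slices follow the factor levels and the columns inside each slice should stay in matrix order.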
Hi Zuguang, I know column_split is now implemented: https://jokergoo.github.io/ComplexHeatmap-reference/book/a-single-heatmap.html#heatmap-split
Just wondering if there is an easier way to do it now.
A more detailed question is: when I do a supervised-clustering, I want to first split the columns (samples) into say 3 pre-defined subgroups first, and then do clustering within each subgroup for columns and do a k-means for all rows.
I know I can make 3 sub-matrices (mat1, mat2, mat3), cluster each sub-matrix on columns, and concatenate the 3 matrices after clustering on columns; then split by columns using a category variable and split by rows using k-means.
Is there an easier way to do it in ComplexHeatmap?
Thanks a lot for this amazing package!
column_split accepts a two-column data frame. Hierarchical clustering is automatically applied in each column slice. See the following examples:
m = matrix(rnorm(10*50), ncol = 50)
fa = sample(letters[1:4], 50, replace = TRUE)
# ha just found, column_km can be used together with column_split
Heatmap(m, column_km = 3, column_split = fa, row_split = 2)
To precisely control the order of column slices:
df = data.frame(
km = kmeans(t(m), centers = 3)$cluster,
fa = fa
)
df$km = factor(df$km, levels = c(1, 2, 3))
df$fa = factor(df$fa, levels = letters[1:4])
Heatmap(m, column_split = df, row_split = 2, cluster_column_slices = FALSE)
And maybe you can check this post to find out how to add nice annotations for the different split variables.
https://jokergoo.github.io/2020/07/06/block-annotation-over-several-slices/
Thanks so much Zuguang!!
Here is my code:
ht1 <- Heatmap(plotdata, name = "expression",
    column_split = le, border = T, cluster_columns = F,
    show_column_names = F, show_row_names = F,
    cluster_column_slices = FALSE,
    column_title_gp = gpar(fill = c(HRisk = 'red', LRisk = 'blue'), alpha = 0.7, fontsize = 18))
How can I set the gap between the column title and the heatmap body?
@saisaitian I think this is something I will improve. The current design, where the column titles are not vertically centered, is to ensure that they are aligned with the title of a ggplot plot if the two are put together.
Currently, you can do the following:
ht_opt$TITLE_PADDING = unit(c(8.5, 8.5), "points")
Heatmap(...)
ComplexHeatmap only supports splitting heatmap by rows because you can simply split the matrix by columns and concatenate each submatrix afterwards.
For your situation, I think you should do k-means clustering on the complete matrix beforehand and assign to split argument later.
mat
mat1 = mat[, 1:3]
mat2 = mat[, 4:6]
mat3 = mat[, 7:9]
km = kmeans(mat, centers = 3)$cluster
row_order = hclust(mat)$order
Heamtap(mat1, row_order = row_order, cluster_rows = FALSE, split = cluster) +
Heatmap(mat2) +
Heatmap(mat3)
I get an error message saying object cluster not found. Also the row_order = hclust(mat)$order gave me the error: Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor exceed 65536") : missing value where TRUE/FALSE needed
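Both errors appear to come from the snippet itself rather than from the data: the k-means result is stored in km, so split = cluster refers to an object that was never defined, and hclust() expects a dissimilarity object such as the output of dist(), not a raw matrix. A corrected sketch, with a random matrix standing in for the real data and heatmap names chosen only for illustration:

```r
set.seed(1)
mat = matrix(rnorm(20 * 9), nrow = 20)

# Three pre-defined column groups
mat1 = mat[, 1:3]
mat2 = mat[, 4:6]
mat3 = mat[, 7:9]

# k-means on the rows of the complete matrix
km = kmeans(mat, centers = 3)$cluster
# hclust() needs a distance object, hence dist(mat)
row_order = hclust(dist(mat))$order

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  library(ComplexHeatmap)
  # Pass km (not the undefined `cluster`) as the row split
  ht_list = Heatmap(mat1, name = "m1", row_order = row_order,
                    cluster_rows = FALSE, split = km) +
    Heatmap(mat2, name = "m2") +
    Heatmap(mat3, name = "m3")
}
```

Each sub-heatmap clusters its own columns by default, while the row split from km is defined on the first heatmap and shared across the list.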
I would like to split my heatmap columns in a custom way and not based on k-means. How can I use column_split to split my heatmap into column groups, e.g. 1-4, 5-6, 7-12?
Then you can do it in two steps:
fa[km %in% 1:4] = "group A"
fa[km %in% 5:6] = "group B"
fa[km %in% 7:13] = "group C"
Then assign fa to column_split.
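If the groups are plain column ranges rather than merged k-means clusters, a grouping factor can also be built directly from the range sizes. This is a sketch assuming 12 columns split as 1-4, 5-6 and 7-12; the group labels and the toy matrix are made up.

```r
# Sizes of the three column ranges: 1-4, 5-6, 7-12
group_sizes = c(4, 2, 6)
group_names = c("group A", "group B", "group C")

# One label per column; the factor levels fix the left-to-right slice order
col_groups = factor(rep(group_names, times = group_sizes), levels = group_names)

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  set.seed(1)
  toy = matrix(rnorm(10 * 12), nrow = 10)
  # Columns are still clustered inside each slice; the slices keep the level order
  ht = ComplexHeatmap::Heatmap(toy, name = "toy",
                               column_split = col_groups,
                               cluster_column_slices = FALSE)
}
```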
Hi,
I looked at the help page for Heatmap, it seems it only supports splitting on rows, and there is a gap parameter for it. Is it possible to split on columns as well?
A more detailed question is: when I do a supervised-clustering, I want to first split the columns (samples) into say 3 pre-defined subgroups first, and then do clustering within each subgroup for columns and do a k-means for all rows. Is it possible?
Now, I am manually arranging the data matrix into three distinct groups and doing a k-means on the rows, with cluster_columns = FALSE.
Thanks, Ming