talgalili / heatmaply

Interactive Heat Maps for R Using plotly
377 stars 75 forks source link

Combining `heatmaply` with plotly visualizations #207

Open michaelweylandt opened 5 years ago

michaelweylandt commented 5 years ago

Hi @talgalili,

Is it possible to combine heatmaply with plotly animations (https://plot.ly/r/animations/)? I.e., could we supply a series of same-size matrices and transition between them using the animation functionality?

I expect this feature is not yet implemented and I'm willing to put in the elbow grease to implement it myself, but I'd appreciate any comments on how feasible this would be.

For background, I am one of the developers of an R package which implements convex bi-clustering (https://github.com/DataSlingers/clustRviz). Convex bi-clustering is a penalized method, so as the penalty parameter is varied, a series of increasingly smooth estimated heatmaps are created. (See slides 19-24 of http://www.stat.rice.edu/~gallen/cobra_talk_ibright.pdf)

While we can compute these solutions relatively quickly, our visualizations are a bit more limited. We currently have i) heatmaply output for a single lambda value; and ii) a shiny app showing the output of gtools::heatmap.2 as lambda as varied. I'm interested in doing something a bit more "web-native" and combining these two.

alanocallaghan commented 5 years ago

Interesting application, thanks for getting in touch! This isn't something that is directly possible at the moment, but it could probably be made to happen with a little bit of work. Transitioning between different matrix values is probably not that difficult (ie, the "main" section of the heatmap). This would simply involve adding a frame attribute to the heatmap plot. However as you are dealing with regularised clustering, I am guessing that the interesting part is transitioning between dendrograms. This might be a little bit more tricky, but should still be doable. I'll outline some implementation notes (for everyone's sake) in another comment.

btw, I noticed on your README what looks like an issue I've been having in Firefox recently with heatmaply plots:

For me, plots initially appear like this, and the "invisible" or non-rendered parts appear after I move the mouse over those regions of the plot. Is this the same for you? If so, is it also a Firefox issue?

alanocallaghan commented 5 years ago

Probably the first thing to do is to create a working prototype. I'll add this in a further comment.

The way heatmaply works internally is to first run heatmapr on the input, which returns a heatmapr object, and then to run heatmaply on that output. We could easily implement a heatmapr.list function that calls itself recursively on the elements of the list - this would produce a list of heatmapr objects in the heatmapr case.

Presumably a lot of the arguments would need to handle lists/vectors too - for example, the row/column side colours, row/column dendrograms, etc, while some will need to be kept constant across all plots (eg, dendrogram = "row" in one and dendrogram = "none" in another would be a mess).

Then we just need to alter the plotting code to handle multiple lists. It will likely be a lot easier to work with the plotly R API rather than using ggplot2 and ggplotly, so we should probably be using plot_method="plotly". In this function, plotly_heatmap and plotly_dend are the two main workhorses for plotting. Both of these at the moment are just functions, but again they could be converted to generics which handle lists of dendrograms. Alternatively, we could alter the existing code to handle either case, but IMO generics and methods would be cleaner.

Implementing these changes at the moment will probably lead to some code duplication, which may be easiest to deal with as it arises. I'd be happy to do code review and help to contribute, as this would be quite a cool and (imo unique) feature.

alanocallaghan commented 5 years ago
library("plotly")
library("heatmaply")
library("dendextend")

dists <- c("euclidean", "maximum", "manhattan", "canberra", "binary", "minkowski")

dends <- lapply(dists, function(dist) {
    hclust(dist(mtcars, method = dist))
})
dends1 <- lapply(dists, function(dist) {
    hclust(dist(t(mtcars), method = dist))
})

plotly_dend_frame <- function(dends, side = c("row", "column"), dend_hoverinfo = TRUE) {

  side <- match.arg(side)

  segs <- lapply(dends, function(dend) {
    if (dendextend::is.hclust(dend)) {
      dend <- as.dendrogram(dend)
    }

    dend_data <- dendextend::as.ggdend(dend)
    segs <- dend_data$segments

    ## Have to get colors back from dendrogram otherwise plotly will make some up
    if (is.null(segs$col) || all(is.na(segs$col))) {
      segs$col <- rep(1, length(segs$col))
    }
    segs$col[is.na(segs$col)] <- "black" # default value for NA is "black"

    if (is.numeric(segs$col)) {
      segs$col <- factor(segs$col)
    }
    segs
  })
  segs <- lapply(seq_along(segs), 
    function(i) { 
      segs[[i]]$frame <- i
      segs[[i]]
    }
  )
  segs <- do.call(rbind, segs)

  lab_max <- nrow(dends[[1]]$labels)
  if (side == "row") {
    lab_max <- lab_max + 0.5
  }

  axis1 <- list(
    title = "",
    range = c(0, max(segs$y)),
    linecolor = "#ffffff",
    showgrid = FALSE
  )
  axis2 <- list(
    title = "",
    range = c(0, lab_max),
    linecolor = "#ffffff",
    showgrid = FALSE
  )
  if (side == "row") {
    add_plot_lines <- function(p) {
      p %>%
        add_segments(
          x = ~y, xend = ~yend, y = ~x, yend = ~xend, color = ~col,
          frame = ~ frame,
          showlegend = FALSE,
          hoverinfo = if (dend_hoverinfo) "x" else "none"
        ) %>%
        layout(
          hovermode = "closest",
          xaxis = axis1,
          yaxis = axis2
        )
    }
  } else {
    add_plot_lines <- function(p) {
      p %>%
        add_segments(
          x = ~x, xend = ~xend, y = ~y, yend = ~yend, color = ~col,
          frame = ~ frame,
          showlegend = FALSE,
          hoverinfo = if (dend_hoverinfo) "y" else "none"
        ) %>%
        layout(
          hovermode = "closest",
          xaxis = axis2,
          yaxis = axis1
        )
    }
  }

  p <- plot_ly(segs) %>% add_plot_lines()
  p
}

row_dend <- plotly_dend_frame(dends)
col_dend <- plotly_dend_frame(dends1, side = "column")

plotly_heatmap_frame <- function(mats) {

  dfs <- lapply(seq_along(mats), function(i) {
    mat <- mats[[i]]
    data.frame(
      z = as.vector(as.matrix(mat)),
      x = rep(seq_len(ncol(mtcars)), each = nrow(mtcars)),
      y = seq_len(nrow(mtcars)),
      frame = i
    )
  })
  input <- do.call(rbind, dfs)

  p <- plot_ly(input,
    z = ~z, x = ~x, y = ~y,
    frame = ~frame,
    type = "heatmap", showlegend = FALSE, hoverinfo = "text"
  ) %>%
    layout(
      xaxis = list(
        tickvals = 1:ncol(mats[[1]]), ticktext = colnames(mats[[1]]),
        linecolor = "#ffffff",
        range = c(0.5, ncol(mats[[1]]) + 0.5),
        showticklabels = TRUE
      ),
      yaxis = list(
        tickvals = 1:nrow(mats[[1]]), ticktext = rownames(mats[[1]]),
        linecolor = "#ffffff",
        range = c(0.5, nrow(mats[[1]]) + 0.5),
        showticklabels = TRUE
      )
    )
  p
}

permuted <- lapply(seq_along(dends), function(i) {
  mtcars[
    sample(seq_len(nrow(mtcars)), nrow(mtcars)), 
    sample(seq_len(ncol(mtcars)), ncol(mtcars))]
})
heatmap <- plotly_heatmap_frame(permuted)

subplot(
  col_dend, 
  plotly_empty(),
  heatmap, 
  row_dend, 
  shareX = TRUE, shareY = TRUE, nrows = 2)
michaelweylandt commented 5 years ago

Hi @Alanocallaghan,

This is fantastic! Thank you. In rough order:

Yeah - the README currently looks like crud, but I think that's an artifact of how rmarkdown embeds heatmaply (images created by the webshot package), not Firefox. If you look at the pkgdown site (https://dataslingers.github.io/clustRviz/) it seems to render fine in Firefox, even before mouse-over. (Checked on Windows and Mac)

For my application, the row- and column-dendrograms are actually fixed as they are constructed by seeing the values of the regularization parameter at which different rows / columns fuse, so I wouldn't need quite the generality you've sketched out. (Though for visualization, I'd probably want to highlight different heights on the dendrograms.)

Building off your prototype, I think something like the following would work for my application:

library("plotly")
library("heatmaply")
library("dendextend")
library("clustRviz")
library("dplyr")

if(!exists("cbass_fit")){
  cbass_fit <- CBASS(presidential_speech)
}

GAMMA_GRID <- seq(0, 1, length.out = 21)

build_dend_plot <- function(fit, side = c("row", "col"), dend_hoverinfo = TRUE){
  side <- match.arg(side)

  segment_info <- lapply(seq_along(GAMMA_GRID), function(g_ix){
    g <- GAMMA_GRID[g_ix]

    dend <- as.dendrogram(fit, type = side)
    dend_data <- dendextend::as.ggdend(dend)$segments

    dend_data$col <- ifelse(dend_data$yend < g, "red4", "blue4")
    dend_data$frame <- g_ix

    dend_data
  })

  segment_info <- dplyr::bind_rows(segment_info)
  segment_info$col <- factor(segment_info$col)

  if(side == "row"){
    lab_max <- fit$n + 0.5
  } else {
    lab_max <- fit$p
  }

  axis1 <- list(
    title = "",
    range = c(0, max(segment_info$y)),
    linecolor = "#ffffff",
    showgrid = FALSE
  )
  axis2 <- list(
    title = "",
    range = c(0, lab_max),
    linecolor = "#ffffff",
    showgrid = FALSE
  )
  if (side == "row") {
    add_plot_lines <- function(p) {
      p %>%
        add_segments(
          x = ~y, xend = ~yend, y = ~x, yend = ~xend, color = ~col,
          frame = ~ frame,
          showlegend = FALSE,
          hoverinfo = if (dend_hoverinfo) "x" else "none"
        ) %>%
        layout(
          hovermode = "closest",
          xaxis = axis1,
          yaxis = axis2
        )
    }
  } else {
    add_plot_lines <- function(p) {
      p %>%
        add_segments(
          x = ~x, xend = ~xend, y = ~y, yend = ~yend, color = ~col,
          frame = ~ frame,
          showlegend = FALSE,
          hoverinfo = if (dend_hoverinfo) "y" else "none"
        ) %>%
        layout(
          hovermode = "closest",
          xaxis = axis2,
          yaxis = axis1
        )
    }
  }

  p <- plot_ly(segment_info) %>% add_plot_lines()
  p
}

build_heatmap_plot <- function(fit){
  plot_data <- lapply(seq_along(GAMMA_GRID), function(g_ix) {
    g <- GAMMA_GRID[g_ix]
    u_hat <- get_clustered_data(fit, percent = g)

    tibble::tibble(
      z = as.vector(u_hat),
      x = as.vector(col(u_hat)),
      y = as.vector(row(u_hat)),
      frame = g_ix
    )
  })

  plot_data <- dplyr::bind_rows(plot_data)
  X <- fit$X

  p <- plot_ly(plot_data,
               z = ~z, x = ~x, y = ~y,
               frame = ~frame,
               type = "heatmap", showlegend = FALSE, hoverinfo = "text"
  ) %>%
    layout(
      xaxis = list(
        tickvals = seq_len(NCOL(X)), ticktext = colnames(X),
        linecolor = "#ffffff",
        range = c(0.5, NCOL(X) + 0.5),
        showticklabels = TRUE
      ),
      yaxis = list(
        tickvals = seq_len(NROW(X)), ticktext = rownames(X),
        linecolor = "#ffffff",
        range = c(0.5, NROW(X) + 0.5),
        showticklabels = TRUE
      )
    )
  p
}

row_dend <- build_dend_plot(cbass_fit, side = "row")
col_dend <- build_dend_plot(cbass_fit, side = "col")
heatmap  <- build_heatmap_plot(cbass_fit)

subplot(col_dend, plotly_empty(), heatmap, row_dend, shareX = TRUE, shareY = TRUE, nrows = 2)

It's still super rough, but a very exciting prototype! Let me take a few days to think about when I'll be able to put some time into making this work for real and mock-up an interface before I start hacking something together.

talgalili commented 4 years ago

Are there plans to introduce this to heatmaply? If not, can we close this issue?

michaelweylandt commented 4 years ago

Hi @talgalili,

Sorry for the delayed reply.

I've been working with @bhmbhm to adapt @Alanocallaghan's great example code into a form that works for our application.

I think it's at a point that we'd be happy to upstream it to y'all, but I'm not sure if there are many applications for "movie heatmaps" outside of my convex bi-clustering context. The only things that come to mind are heatmaps over time (e.g., longitudinal studies with a fixed population) but I don't think you'd have the fixed row and column dendrograms then.

Is this something you'd be interested in having? If so, @bhmbhm and I can put an example online and send a PR to start the detailed discussion.

talgalili commented 4 years ago

Hi Michael, I don't understand what you describe. Could you please share an example so I'd have a better sense of what the feature is?

Thanks.

On Mon, Aug 19, 2019 at 1:19 AM Michael Weylandt notifications@github.com wrote:

Hi @talgalili https://github.com/talgalili,

Sorry for the delayed reply.

I've been working with @bhmbhm https://github.com/bhmbhm to adapt @Alanocallaghan https://github.com/Alanocallaghan's great example code into a form that works for our application.

I think it's at a point that we'd be happy to upstream it to y'all, but I'm not sure if there are many applications for "movie heatmaps" outside of my convex bi-clustering context. The only things that come to mind are heatmaps over time (e.g., longitudinal studies with a fixed population) but I don't think you'd have the fixed row and column dendrograms then.

Is this something you'd be interested in having? If so, @bhmbhm https://github.com/bhmbhm and I can put an example online and send a PR to start the detailed discussion.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/talgalili/heatmaply/issues/207?email_source=notifications&email_token=AAHOJBXO2TM2P3OKKFPDDJ3QFHDIHA5CNFSM4GTTA6D2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4RJSEQ#issuecomment-522361106, or mute the thread https://github.com/notifications/unsubscribe-auth/AAHOJBRYDIFU477DBUAKKXLQFHDIHANCNFSM4GTTA6DQ .

alanocallaghan commented 4 years ago

I'd be interested in seeing some examples. Perhaps it's too complex to merge to heatmaply, but it may be nice to develop as a separate package.

michaelweylandt commented 4 years ago

@talgalili: Consider a scenario where you have T different n-by-p matrices and you want to view them as heatmaps. We want to visualize this as a movie (T being time) so we are using heatmaply + plotly's animation functionality.

In our case, we are getting our heatmaps via Convex BiClustering, so the different heatmaps are really different degrees of smoothing the same raw data. See, e.g., Slide 10 (pp.19-25) of https://www.math.wustl.edu/~kuffner/WHOA-PSI-3/GeneveraAllen-slides.pdf for a low-tech version.

If you want dendrograms on the sides as with static heatmaply, there's a bit of additional complexity: if you are re-seriating at each frame, that makes interpretation tricky. In the convex bi-clustering case, we construct the dendrogram once and fix it across different values of lambda / time.

I'm not sure about examples other than convex bi-clustering: maybe time-course gene expression data. (You have to have the same subjects and genes at each time point so it makes sense to align things.)

@bhmbhm: Can we put an example online for @talgalili and @Alanocallaghan to look at? Doesn't need to be anything fancy.

talgalili commented 4 years ago

I'd love for an example. In general, it sounds to me like your are not needing the interactive functionality of zooming, nor are you interested in the dendrograms, but rather that you wish to have an animation of a series of heatmaps. In such case, it sounds like something that is worth putting in a package (although it may make more sense to add it to its own package). If you wish it to be part of heatmaply, and think it makes sense, I'm happy to let you send a PR, but please be aware that you will be responsible dealing with future bug reports and issues (and if there are major problems, we may just need to remove the code if we won't be able to support it).

Makes sense?