corybrunson / ggalluvial

ggplot2 extension for alluvial plots
http://corybrunson.github.io/ggalluvial/
GNU General Public License v3.0

Highlight specific links in ggalluvial with data in lodes format #101

Status: Closed (dcarbajo closed this issue 2 years ago)

dcarbajo commented 2 years ago

I am wondering if it is possible to highlight specific links between nodes in an alluvial plot.

I managed to do this with circlize::chordDiagram() for circos plots, as it accepts one color palette for the grid (nodes) and a different one for the links.

However, although I thought it would be much easier with ggalluvial, I find myself stuck. The main problem, as seen in the MWE below, is that I don't see a way to supply one color palette for the nodes and a different one for the links.

I made this MWE with a very similar structure to my real data, using the UCBAdmissions data, to show what I mean.

  ucb_df <- as.data.frame(UCBAdmissions)
  ucb_df$Admit <- NULL
  ucb_df$Highlight <- FALSE
  ucb_df$Highlight[ucb_df$Gender=='Female' & ucb_df$Dept %in% c('A','D')] <- TRUE

  ucb_lodes <- ggalluvial::to_lodes_form(ucb_df, axes = 1:2, id = "cohort")

  col_vector <- RColorBrewer::brewer.pal(9, 'Set1')
  all_colors <- grDevices::colorRampPalette(col_vector)

  plot_palette <- c(col_vector[1:length(levels(ucb_df$Gender))], col_vector[1:length(levels(ucb_df$Dept))])
  plot_palette <- stats::setNames(plot_palette, c(levels(ucb_df$Gender), levels(ucb_df$Dept)))

  P <- ggplot2::ggplot(ucb_lodes, ggplot2::aes(x = x, stratum = stratum, y = Freq)) +
    ggplot2::scale_x_discrete(expand = c(.1, .1)) +
    ggplot2::scale_fill_manual(values=plot_palette) +
    ggalluvial::stat_stratum(geom = "stratum", alpha = .5, ggplot2::aes(fill = stratum)) +
    ggalluvial::stat_stratum(geom = "text", ggplot2::aes(label = stratum), size = 3) +
    ggalluvial::geom_flow(ggplot2::aes(alluvium = cohort, fill = stratum)) +
    ggplot2::ggtitle("test with highlights - 1") +
    ggplot2::scale_y_continuous(name="Freq") +
    ggplot2::theme_light() +
    ggplot2::theme(plot.title=ggplot2::element_text(face="bold",size=20),
                   axis.text.x=ggplot2::element_text(size=16),
                   axis.text.y=ggplot2::element_text(size=8),
                   axis.title.x=ggplot2::element_blank(),
                   axis.title.y=ggplot2::element_text(size=16),
                   legend.position="none")
  grDevices::pdf(file="highlights_test1.pdf", height=6, width=8)
  print(P)
  grDevices::dev.off()

This produces the following plot:

Screenshot 2022-05-31 at 2 21 51 PM

So far so good, but now I want to highlight specific links and nodes, in particular Female going to Depts A and D.

For that, I already have the Highlight column in my data. Now I just modify the color palette to grey out the links I don't want to highlight.

  plot_palette <- stats::setNames(rep("grey80",length(plot_palette)), names(plot_palette))
  high_colors <- data.frame(ID=c('Female','A','D'), color=c('black','red','blue'))
  high_colors <- high_colors[match(names(plot_palette)[names(plot_palette) %in% high_colors$ID], high_colors$ID),]
  plot_palette[names(plot_palette) %in% high_colors$ID] <- high_colors$color

  P <- ggplot2::ggplot(ucb_lodes, ggplot2::aes(x = x, stratum = stratum, y = Freq)) +
    ggplot2::scale_x_discrete(expand = c(.1, .1)) +
    ggplot2::scale_fill_manual(values=plot_palette) +
    ggalluvial::stat_stratum(geom = "stratum", alpha = .5, ggplot2::aes(fill = stratum)) +
    ggalluvial::stat_stratum(geom = "text", ggplot2::aes(label = stratum), size = 3) +
    ggalluvial::geom_flow(ggplot2::aes(alluvium = cohort, fill = stratum)) +
    ggplot2::ggtitle("test with highlights - 2") +
    ggplot2::scale_y_continuous(name="Freq") +
    ggplot2::theme_light() +
    ggplot2::theme(plot.title=ggplot2::element_text(face="bold",size=20),
                   axis.text.x=ggplot2::element_text(size=16),
                   axis.text.y=ggplot2::element_text(size=8),
                   axis.title.x=ggplot2::element_blank(),
                   axis.title.y=ggplot2::element_text(size=16),
                   legend.position="none")
  grDevices::pdf(file="highlights_test2.pdf", height=6, width=8)
  print(P)
  grDevices::dev.off() 

This gives me this plot, which is not what I want:

Screenshot 2022-05-31 at 2 22 06 PM

What I actually want is the following, which I mocked up:

Screenshot 2022-05-31 at 2 23 56 PM

Is that possible? I tried fill = Highlight in the geom_flow() line, but it didn't work. It seems I would need to provide a second fill palette for the flow links. I hope you can help me!

Thanks!

corybrunson commented 2 years ago

Hi @dcarbajo and thanks for the detailed reproducible example.

I believe the problem lies with your definition of plot_palette, which does not include colors for the values TRUE and FALSE that the Highlight variable takes. With fill = Highlight mapped in geom_flow(), the following change should result in those two flows being a darker shade than the others:

plot_palette <- c(plot_palette, `TRUE` = "black", `FALSE` = "grey")

However, Highlight only takes two values, so using it as an aesthetic only lets you use two colors (here black and grey), not three (e.g. black, red, and blue). To get three, you might redefine Highlight upstream as an interaction between itself and Dept. Let me know if that doesn't turn out to work either!
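A minimal sketch of one way to read that suggestion, not tested against the real data in this thread. It replaces the logical Highlight column with a character key that combines the highlight condition and Dept, then adds one palette entry per key. The column name flow_key and the labels "Female to A", "Female to D", and "other" are hypothetical names chosen for illustration. The plotting part only runs if ggplot2 and ggalluvial are installed.

```r
# Sketch only: "flow_key" and its labels are hypothetical, not from the thread.
ucb_df <- as.data.frame(UCBAdmissions)
ucb_df$Admit <- NULL

# One key per highlighted flow; every other flow shares the key "other".
is_high <- ucb_df$Gender == "Female" & ucb_df$Dept %in% c("A", "D")
ucb_df$flow_key <- ifelse(is_high, paste("Female to", ucb_df$Dept), "other")

# Strata colors (named by stratum) and flow colors (named by flow_key)
# live in the same named vector because both layers share one fill scale.
strata <- c(levels(ucb_df$Gender), levels(ucb_df$Dept))
plot_palette <- stats::setNames(rep("grey80", length(strata)), strata)
plot_palette[c("Female", "A", "D")] <- c("black", "red", "blue")
plot_palette <- c(plot_palette,
                  "Female to A" = "red",
                  "Female to D" = "blue",
                  "other" = "grey80")

if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggalluvial", quietly = TRUE)) {
  ucb_lodes <- ggalluvial::to_lodes_form(ucb_df, axes = 1:2, id = "cohort")

  P <- ggplot2::ggplot(ucb_lodes,
                       ggplot2::aes(x = x, stratum = stratum, y = Freq)) +
    ggplot2::scale_fill_manual(values = plot_palette) +
    # Flows are filled by the per-flow key, strata by their own name.
    ggalluvial::geom_flow(ggplot2::aes(alluvium = cohort, fill = flow_key)) +
    ggalluvial::stat_stratum(geom = "stratum", alpha = .5,
                             ggplot2::aes(fill = stratum)) +
    ggalluvial::stat_stratum(geom = "text",
                             ggplot2::aes(label = stratum), size = 3) +
    ggplot2::theme(legend.position = "none")
  print(P)
}
```

Because the key is constant within each cohort, both ends of a flow carry the same fill value, so the color should not depend on which axis the flow takes its aesthetics from.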

dcarbajo commented 2 years ago

Thanks for your comment, it really helped. I just needed to add the Highlight levels to plot_palette. This is one of the plots with my real data, working well.

Screenshot 2022-06-02 at 12 39 52 PM

corybrunson commented 2 years ago

Great! Thanks for confirming.