Closed RoganGrant closed 4 years ago
Hi @RoganGrant, thanks for the endorsement!
What you're after is actually pretty common in Sankey plots generally, especially those that allow free-floating nodes rather than stacked strata. Many software packages enable this, but I don't know of any ggplot2 extensions that do. One thing you can do in ggalluvial is complete the data frame with `NA` (missing) values of `group`, which will show up as grey graphical objects by default. Since those who skip one step but continue on to a future step are (I presume) different from those who stop at that step, this may be what you want: it will preserve the gradual shrinking of the stacked histograms, just with one off-color stratum at each axis. Another option is to complete the data frame in the same way but use the `y` aesthetic to shrink those boxes to zero height. As a result, the flows will still be plotted but they will not contribute to the height of the stacked histogram. Both options are illustrated below. (I need to better document the difference in behavior between `stat_alluvium()` and `stat_flow()` with respect to `NA`.)
I hope this helps!
```r
library(ggplot2)
library(ggalluvial)
#> Warning: package 'ggalluvial' was built under R version 4.0.2

# Minimal example: samples 1 and 2 have no row for test2
ex <- data.frame(
  sample = c(rep(1, 2), rep(2, 2), rep(3, 3), rep(4, 3), rep(5, 3)),
  test = c(rep(c("test1", "test3"), 2), rep(c("test1", "test2", "test3"), 3)),
  group = c(rep(c("group1"), 4), rep(c("group2"), 9))
)
ggplot(ex, aes(x = test, stratum = group, alluvium = sample,
               fill = group, label = group)) +
  geom_flow(stat = "alluvium") +
  geom_stratum()

# Option 1: complete the data frame; the added rows get group = NA
ex <- tidyr::complete(ex, sample, test)
ggplot(ex, aes(x = test, stratum = group, alluvium = sample,
               fill = group, label = group)) +
  geom_alluvium() +
  geom_stratum()
#> Warning in f(...): Some differentiation aesthetics vary within alluvia, and will be diffused by their first value.
#> Consider using `geom_flow()` instead.

# Option 2: give the added rows zero height via y, then fill in their group
ex$n <- ifelse(is.na(ex$group), 0, 1)
ex <- tidyr::fill(ex, group)
ggplot(ex, aes(x = test, stratum = group, alluvium = sample,
               fill = group, label = group, y = n)) +
  geom_flow(stat = "alluvium") +
  geom_stratum()
```
Created on 2020-07-30 by the reprex package (v0.3.0)
Thanks so much! Any of these will probably work.
For anyone with the same question, I used a hybrid solution of this and my own to get what I was after:
- A `hide` variable, which can isolate these samples (just factored as T/F)
- `geom_alluvium()` to get complete curves without breaks
- `geom_stratum(aes(alpha = hide, color = hide))`
```r
scale_color_manual(name = "",
                   values = c("FALSE" = "black",
                              "TRUE" = alpha("white", 0))) +
scale_alpha_manual(name = "",
                   values = c("FALSE" = 1,
                              "TRUE" = 0))
```
Flows straight past, as I wanted!
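For anyone trying to reproduce this, the pieces above might fit together roughly as follows. This is only a sketch on the toy `ex` data from the reprex earlier in the thread, not the code actually used: in particular, the way `hide` is built here (flagging the rows added by `tidyr::complete()`) is an assumption.

```r
library(ggplot2)
library(ggalluvial)

# Toy data from the reprex above: samples 1 and 2 skip test2.
ex <- data.frame(
  sample = c(rep(1, 2), rep(2, 2), rep(3, 3), rep(4, 3), rep(5, 3)),
  test = c(rep(c("test1", "test3"), 2), rep(c("test1", "test2", "test3"), 3)),
  group = c(rep("group1", 4), rep("group2", 9))
)

# Add the missing sample/test combinations (group is NA in those rows).
ex <- tidyr::complete(ex, sample, test)
# Assumption: `hide` flags exactly the rows that complete() added.
ex$hide <- is.na(ex$group)
# Carry each sample's group forward so every alluvium keeps a single fill.
ex <- tidyr::fill(ex, group)

p <- ggplot(ex, aes(x = test, stratum = group, alluvium = sample,
                    fill = group)) +
  geom_alluvium() +
  # Strata made up of flagged rows get a transparent fill and outline.
  geom_stratum(aes(alpha = hide, color = hide)) +
  scale_color_manual(name = "",
                     values = c("FALSE" = "black",
                                "TRUE" = alpha("white", 0))) +
  scale_alpha_manual(name = "",
                     values = c("FALSE" = 1,
                                "TRUE" = 0))
p
```

In this sketch the hidden box is only drawn transparent, so it should still take up its share of the column height at `test2`.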
First of all, thank you; this is a fantastic package!
I recognize that this is a rare use-case, but I am making an alluvial diagram to show a workflow. For one group, step 2 is not performed, whereas subsequent steps are. This unfortunately means that no alluvial lines are drawn for this group from column 1. Would it be possible to allow the flow lines to "skip" a column and directly point to a subsequent one? In the attached image, the group of interest would be the topmost (blue-grey).
Minimal code to encounter the same issue: