corybrunson / ggalluvial

ggplot2 extension for alluvial plots
http://corybrunson.github.io/ggalluvial/
GNU General Public License v3.0

Inconsistent z ordering in some of the alluvial flows #8

Closed: timchurches closed 7 years ago

timchurches commented 7 years ago

ggalluvial is very nifty! However, I've noticed a few flaws in the flow rendering that look like they might be due to inconsistent z-ordering when each flow is drawn (or it may be user error); see below:

[Screenshot, 2017-10-24: alluvial plot showing the flow rendering flaws]

Data to reproduce this:

x <-
structure(list(visit_id = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 
36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 
49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 
62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 
75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 
15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 
28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 
41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 
54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 
67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 77L, 78L, 79L, 
80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 
20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 
33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 
46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 
59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 
72L, 73L, 74L, 75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 
85L, 86L, 87L), obs_timing = structure(c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), .Label = c("Triage obs", "First ED obs", "Last ED obs"), class = c("ordered", 
"factor")), raps = c(11, 2, 3, 2, 3, 2, 2, 4, 2, 3, 4, 2, 5, 
3, 3, 4, 5, 4, 2, 4, 5, 3, 5, 2, 3, 3, 5, 3, 3, 2, 5, 6, 2, 3, 
2, 2, 12, 2, 3, 4, 3, 5, 3, 2, 4, 2, 2, 3, 2, 2, 2, 3, 3, 2, 
2, 4, 4, 3, 2, 2, 3, 2, 4, 3, 4, 3, 2, 3, 2, 4, 6, 3, 9, 2, 3, 
2, 2, 7, 2, 2, 4, 2, 4, 2, 3, 2, 3, 11, 0, 2, 2, 5, 0, 0, 4, 
0, 5, 1, 0, 5, 4, 0, 6, 2, 4, 1, 3, 2, 6, 2, 2, 3, 3, 4, 6, 4, 
2, 6, 2, 2, 3, 2, 0, 5, 2, 3, 2, 4, 4, 2, 2, 0, 3, 2, 4, 1, 3, 
0, 2, 2, 0, 2, 4, 6, 2, 0, 2, 0, 0, 2, 3, 7, 7, 5, 6, 0, 2, 1, 
2, 10, 2, 2, 2, 0, 6, 2, 3, 0, 2, 4, 0, 7, 2, 6, 7, 2, 2, 2, 
2, 0, 0, 2, 2, 5, 1, 0, 6, 3, 0, 6, 2, 3, 2, 2, 2, 6, 2, 0, 4, 
1, 0, 6, 6, 0, 0, 0, 0, 2, 2, 2, 0, 0, 3, 2, 2, 1, 2, 2, 0, 2, 
2, 4, 0, 0, 0, 2, 2, 0, 0, 2, 6, 2, 0, 0, 0, 0, 2, 3, 6, 1, 2, 
6, 0, 2, 1, 0, 6, 2, 2, 2, 0, 2, 0, 2, 0, 2, 0, 0, 4, 2, 5), 
    raps_grp = structure(c(2L, 6L, 6L, 6L, 6L, 6L, 6L, 5L, 6L, 
    6L, 5L, 6L, 5L, 6L, 6L, 5L, 5L, 5L, 6L, 5L, 5L, 6L, 5L, 6L, 
    6L, 6L, 5L, 6L, 6L, 6L, 5L, 4L, 6L, 6L, 6L, 6L, 1L, 6L, 6L, 
    5L, 6L, 5L, 6L, 6L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 5L, 6L, 5L, 6L, 6L, 6L, 6L, 
    5L, 4L, 6L, 3L, 6L, 6L, 6L, 6L, 4L, 6L, 6L, 5L, 6L, 5L, 6L, 
    6L, 6L, 6L, 2L, 7L, 6L, 6L, 5L, 7L, 7L, 5L, 7L, 5L, 7L, 7L, 
    5L, 5L, 7L, 4L, 6L, 5L, 7L, 6L, 6L, 4L, 6L, 6L, 6L, 6L, 5L, 
    4L, 5L, 6L, 4L, 6L, 6L, 6L, 6L, 7L, 5L, 6L, 6L, 6L, 5L, 5L, 
    6L, 6L, 7L, 6L, 6L, 5L, 7L, 6L, 7L, 6L, 6L, 7L, 6L, 5L, 4L, 
    6L, 7L, 6L, 7L, 7L, 6L, 6L, 4L, 4L, 5L, 4L, 7L, 6L, 7L, 6L, 
    2L, 6L, 6L, 6L, 7L, 4L, 6L, 6L, 7L, 6L, 5L, 7L, 4L, 6L, 4L, 
    4L, 6L, 6L, 6L, 6L, 7L, 7L, 6L, 6L, 5L, 7L, 7L, 4L, 6L, 7L, 
    4L, 6L, 6L, 6L, 6L, 6L, 4L, 6L, 7L, 5L, 7L, 7L, 4L, 4L, 7L, 
    7L, 7L, 7L, 6L, 6L, 6L, 7L, 7L, 6L, 6L, 6L, 7L, 6L, 6L, 7L, 
    6L, 6L, 5L, 7L, 7L, 7L, 6L, 6L, 7L, 7L, 6L, 4L, 6L, 7L, 7L, 
    7L, 7L, 6L, 6L, 4L, 7L, 6L, 4L, 7L, 6L, 7L, 7L, 4L, 6L, 6L, 
    6L, 7L, 6L, 7L, 6L, 7L, 6L, 7L, 7L, 5L, 6L, 5L), .Label = c("12+", 
    "10-11", "8-9", "6-7", "4-5", "2-3", "0-1"), class = c("ordered", 
    "factor"))), .Names = c("visit_id", "obs_timing", "raps", 
"raps_grp"), row.names = c(NA, -261L), class = c("tbl_df", "tbl", 
"data.frame"))

Code to reproduce this:

# devtools::install_github("corybrunson/ggalluvial", build_vignettes = TRUE)
library(ggalluvial)
library(scales)

minraps <- 2

# One alluvium per patient visit, filled by RAPS group
p <- ggplot(x, aes(x = obs_timing, stratum = raps_grp, alluvium = visit_id,
                   fill = raps_grp, label = raps_grp)) +
  geom_flow(stat = "alluvium", lode.guidance = "leftright") +
  geom_stratum() +
  labs(x = "Time observations taken", y = "No. of patients",
       fill = "RAPS group", title = "Change in RAPS in ED patients",
       subtitle = paste("Only patients with triage RAPS ≥", minraps, "shown")) +
  scale_y_continuous(breaks = pretty_breaks(n = 10))

black_plot <- p +
  scale_fill_brewer(type = "div", palette = 9) +
  theme_void() +
  theme(plot.background = element_rect(fill = "black"),
        panel.background = element_rect(fill = "black"),
        axis.title = element_text(colour = "white"),
        axis.title.x = element_blank(),
        axis.text = element_text(colour = "white"),
        axis.ticks.y = element_line(colour = "white"),
        legend.title = element_text(colour = "white"),
        legend.text = element_text(colour = "white"),
        plot.title = element_text(colour = "white"),
        plot.subtitle = element_text(colour = "white"),
        plot.margin = unit(c(0.5, 0.5, 0.5, 0.5), "cm"))

print(black_plot)

corybrunson commented 7 years ago

Thank you for the compliment!

If I understand correctly (I had to Google "z-ordering"), the issue is that the z-ordering of two flows determines the color of their overlapping region, so that when multiple flows are interwoven, an apple-pie-type pattern appears. I've wondered about this (I think the Titanic example in the vignette has the same problem), but I haven't figured out how to stop it. This is certainly an issue that needed to be raised, and I'll look into it.

In the meantime, if the focus of the diagram isn't the alluvia (individual patients) but rather the flows between strata (transitions between RAPS groups), then you could merge the alluvial segments into flows by replacing

geom_flow(stat="alluvium", lode.guidance = "leftright")

with

geom_flow(stat="flow")

This would obviate the issue by having only one overlap between two flows of a given pair of colors.
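As a hedged illustration of this workaround, here is a minimal sketch on made-up toy data. The `toy` data frame and its column names are hypothetical and not taken from the issue, and the sketch assumes a ggalluvial version in which geom_flow() accepts stat = "flow", as suggested above.

```r
library(ggplot2)
library(ggalluvial)

# Hypothetical toy data: four subjects observed at two time points.
toy <- data.frame(
  id    = rep(1:4, times = 2),
  time  = factor(rep(c("t1", "t2"), each = 4)),
  group = c("a", "a", "b", "b", "a", "b", "a", "b")
)

# stat = "flow" merges the alluvial segments between each pair of strata
# into a single flow, instead of drawing one ribbon per subject.
p <- ggplot(toy, aes(x = time, stratum = group, alluvium = id, fill = group)) +
  geom_flow(stat = "flow") +
  geom_stratum()

print(p)
```

The trade-off is the one noted above: individual alluvia are no longer distinguishable, only the transitions between strata.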

corybrunson commented 7 years ago

I tried sorting the data internally (with dplyr::arrange_() in StatAlluvium$compute_panel()) by the aesthetic variables and then redefining the group variable that geom_alluvium() uses to plot the alluvia. This way, geom_alluvium() plots all alluvia of a given fill (or other aesthetic) value before moving on to the next. For me, it fixes both the example in the vignette and the example you've provided. Give it a try by installing from the new z-ordering branch:

devtools::install_github("corybrunson/ggalluvial", ref = "z-ordering")

Let me know how it goes!
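The reordering idea described in this comment can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual code: the `lodes` data frame and its columns are hypothetical, base `order()` stands in for dplyr::arrange_(), and the sketch assumes the geom draws groups in increasing order of `group`.

```r
# Hypothetical toy data: one row per alluvium per axis, with a fill value.
lodes <- data.frame(
  alluvium = rep(1:4, each = 2),
  x        = rep(c(1, 2), times = 4),
  fill     = rep(c("b", "a", "b", "a"), each = 2),
  stringsAsFactors = FALSE
)

# Sort by the aesthetic first, so alluvia sharing a fill are contiguous.
sorted <- lodes[order(lodes$fill, lodes$alluvium, lodes$x), ]

# Redefine the group index to follow the sorted order: every alluvium of
# one fill value now gets a lower group number than any alluvium of the next.
sorted$group <- match(sorted$alluvium, unique(sorted$alluvium))

print(sorted)
```

With the groups renumbered this way, all ribbons of one fill value are drawn before any ribbon of the next, so overlaps between a given pair of colors always stack in the same direction.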

timchurches commented 7 years ago

Cory,

Thanks! Firstly, changing to geom_flow(stat="flow") fixed the problem, although the individual alluvia are nice to have. But the good news is that the changes on the z-ordering branch completely fix the problem: it's now perfect even when individual alluvia are plotted.

corybrunson commented 7 years ago

Great! I'll synchronize the ordering with that of stat_flow() for consistency, then merge the changes into master.