corybrunson / ggalluvial

ggplot2 extension for alluvial plots
http://corybrunson.github.io/ggalluvial/
GNU General Public License v3.0

How to merge two alluvial plots to 1 by the axis2 (i.e. the axis in the middle) #133

Closed bridgeovertroubledhuman closed 3 months ago

bridgeovertroubledhuman commented 3 months ago

I want to create one plot like this

titanic_wide <- data.frame(Titanic)
head(titanic_wide)

ggplot(data = titanic_wide,
       aes(axis1 = Class, 
           axis2 = Survived, 
           axis3 = Age,
           y = Freq)) +
  scale_x_discrete(limits = c("Class", "Survived", "Age Group"), expand = c(.2, .05)) +
  geom_alluvium(aes(fill = Survived), show.legend = FALSE) +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  theme_minimal() +
  ggtitle("Passengers on the maiden voyage of the Titanic",
          "by survival")

However, after examining the data, I realized the many lines don't help me as much as I had hoped (e.g., from the plot I had no idea that there were children in the second class who survived). So I think that, for my purpose, a plot like this on the left

leftpartofplot <- ggplot(data = titanic_wide,
       aes(axis1 = Class, 
           axis2 = Survived, 
           y = Freq)) +
  scale_x_discrete(limits = c("Class", "Survived"), expand = c(.2, .05)) +
  geom_alluvium(aes(fill = Survived), show.legend = FALSE) +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  theme_minimal() +
  ggtitle("Passengers on the maiden voyage of the Titanic",
          "by survival")

and this on the right

rightpartofplot <- ggplot(data = titanic_wide,
       aes(axis1 = Survived, 
           axis2 = Age,
           y = Freq)) +
  scale_x_discrete(limits = c("Survived", "Age Group"), expand = c(.2, .05)) +
  geom_alluvium(aes(fill = Survived), show.legend = FALSE) +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  theme_minimal() +
  ggtitle("Passengers on the maiden voyage of the Titanic",
          "by survival")

would be better. I have tried to show it in the picture below. Is this possible with ggalluvial?

bridgeovertroubledhuman commented 3 months ago

![Screenshot 2024-08-20 at 14 48 19](https://github.com/user-attachments/assets/f21f73bb-7247-4158-bcda-316d72a5b374)

corybrunson commented 3 months ago

Hi @bridgeovertroubledhuman, have you tried using the flow stat rather than the alluvium stat? (See documentation here.) If I understand you correctly, then it's exactly what you're after.
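
A rough way to see what the flow stat changes, using only base R. This sketch does not call ggalluvial; it only mimics the aggregation, and the names `n_alluvia`, `left_flows` and `right_flows` are made up for illustration. The idea is that flows are drawn between adjacent axes only, so each pair of neighbouring axes is summarised on its own instead of every row being traced across all three axes.

```r
# Titanic ships with base R (package 'datasets'), so no extra packages are needed.
titanic_wide <- data.frame(Titanic)

# Alluvium view: every row is one cohort that is followed across all axes.
n_alluvia <- nrow(titanic_wide)

# Flow view (approximated here by plain aggregation): totals between
# neighbouring axes only, with no record of the axis that came before.
left_flows  <- aggregate(Freq ~ Class + Survived, data = titanic_wide, FUN = sum)
right_flows <- aggregate(Freq ~ Survived + Age,   data = titanic_wide, FUN = sum)

print(n_alluvia)
print(left_flows)
print(right_flows)
```

Both tables add up to the same number of passengers, but `right_flows` no longer knows which class anyone travelled in, which is why the second half of a flow plot is so much less tangled.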

bridgeovertroubledhuman commented 3 months ago

Oh thank you so much! I had no idea it works like this. Yes, I wanted a "memoryless flow"!

thisiswhatiwanted <- ggplot(data = titanic_wide,
       aes(axis1 = Class, 
           axis2 = Survived, 
           axis3 = Age,
           y = Freq)) +
  scale_x_discrete(limits = c("Class", "Survived", "Age Group"), expand = c(.2, .05)) +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  geom_flow(aes(fill = Survived), show.legend = FALSE) +
  theme_minimal() +
  ggtitle("Passengers on the maiden voyage of the Titanic",
          "by survival")

bridgeovertroubledhuman commented 3 months ago

Sorry @corybrunson there is one thing I cannot wrap my head around:

Why are the strata colored with the colors from the alluvium (when you don't set the alpha of geom_stratum to 1), while the strata are always white when you use geom_flow instead of geom_alluvium?

Judging from the example on the page you linked before, it seems to depend on the data format, i.e. whether axis1, axis2, and axis3 are defined or whether the aesthetics look like this: aes(x = survey, stratum = response, alluvium = subject, y = freq, fill = response). However, I can't replicate that behavior, so I think I might be missing something.

Thank you!

corybrunson commented 3 months ago

There are two behaviors at work here:

  1. The alluvia comprise both flows (between the strata) and lodes (overlapping the strata). This is stated in the main vignette but only really illustrated in the JOSS paper.
  2. The strata (and alluvia) are aesthetics to which specific variables must be passed, in these examples alluvium = subject and stratum = response. The strata are only colored when the fill aesthetic is passed the same variable as the stratum aesthetic.

Does that clear it up?
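
The second point can be sketched with the Titanic data in lodes (long) form. This is a minimal example, not the only way to do it: it assumes `to_lodes_form()` is used to reshape the data, and the object names `titanic_axes` and `titanic_lodes` are only for illustration. Because `fill` is mapped to the same variable as `stratum`, the strata pick up the fill colors as well.

```r
library(ggplot2)
library(ggalluvial)

titanic_wide <- data.frame(Titanic)

# Put the three axis columns first, in plotting order, followed by the weight.
titanic_axes <- titanic_wide[, c("Class", "Survived", "Age", "Freq")]

# Reshape to one row per (alluvium, axis) pair.
titanic_lodes <- to_lodes_form(titanic_axes, axes = 1:3,
                               key = "x", value = "stratum", id = "alluvium")

# fill and stratum are mapped to the same variable, so the strata are colored.
p <- ggplot(titanic_lodes,
            aes(x = x, stratum = stratum, alluvium = alluvium,
                y = Freq, fill = stratum)) +
  geom_flow() +
  geom_stratum(alpha = .5) +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  theme_minimal()
p
```

Mapping `fill` to any other variable (for example one that only exists per alluvium) leaves `geom_stratum()` with its default fill, which matches the white strata described above.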

bridgeovertroubledhuman commented 3 months ago

Thank you. I didn't get the answer from either text. However, in case someone else has the same problem, this is my solution: when you want geom_flow with lodes filled by another variable and your data is in axis1/axis2/axis3 format, you need geom_stratum with no options and geom_lode with the desired fill.

ggplot(data = titanic_wide,
       aes(axis1 = Class, 
           axis2 = Survived, 
           axis3 = Age,
           y = Freq)) +
  scale_x_discrete(limits = c("Class", "Survived", "Age Group"), expand = c(.2, .05)) +
  geom_stratum() +
  geom_lode(aes(fill = Survived), show.legend = FALSE) +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  geom_flow(aes(fill = Survived), show.legend = FALSE) +
  theme_minimal() +
  ggtitle("Passengers on the maiden voyage of the Titanic",
          "by survival")

See the image below: solution.pdf