corybrunson / ggalluvial

ggplot2 extension for alluvial plots
http://corybrunson.github.io/ggalluvial/
GNU General Public License v3.0

Question: flows merging/overlapping on same destination stratum? #128

Closed: MatthieuStigler closed this issue 6 months ago

MatthieuStigler commented 6 months ago

I would like to draw flows that merge or overlap on the same destination stratum. This corresponds to a case where units A and B at the first step are merged into a single unit B at the second step, so that the flows from A.1 overlap to some extent with the flows from B.1 on B.2. Is this possible?

This has been asked on Stack Overflow ("Confluencing/Merging Flows in ggalluvial"), and the answer seems to be no, but I wanted to confirm with you.

If this is not possible in a user-friendly way, is there a (dirty) hack you could suggest, such as recovering the output data, changing it, and re-plotting?

Thanks!

library(ggalluvial)
#> Loading required package: ggplot2
dat <- tibble::tribble(~Step_1, ~Cat, ~freq, ~flow_id,
                        "Step_1", "A", 40, 1,
                        "Step_2", "A", 40, 1,
                        "Step_1", "A", 28, 2,
                        "Step_2", "B", 28, 2,
                        "Step_1", "B", 12, 3,
                        "Step_2", "B", 12, 3,
                        "Step_1", "B", 15, 4,
                        "Step_2", "A", 15, 4)

dat |>
  ggplot(aes(x = Step_1, stratum = Cat, alluvium = flow_id,
             y = freq,
             fill = Cat, label = Cat)) +
  scale_x_discrete(expand = c(.1, .1)) +
  geom_flow() +
  geom_stratum(alpha = .9) +
  geom_text(stat = "stratum", size = 3) 
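
On the idea of recovering the output data: a minimal sketch, assuming the standard ggplot2 helper `layer_data()`, which returns the data frame a layer is drawn from after the stat has run. The object names `p` and `flow_data` are illustrative, and the labels are left out to keep the example short. This only shows where the computed flow positions can be inspected; it does not by itself produce merged flows.

```r
library(ggalluvial)

# Same data as in the example above
dat <- tibble::tribble(
  ~Step_1,  ~Cat, ~freq, ~flow_id,
  "Step_1", "A",  40,    1,
  "Step_2", "A",  40,    1,
  "Step_1", "A",  28,    2,
  "Step_2", "B",  28,    2,
  "Step_1", "B",  12,    3,
  "Step_2", "B",  12,    3,
  "Step_1", "B",  15,    4,
  "Step_2", "A",  15,    4
)

# Store the plot instead of printing it
p <- ggplot(dat, aes(x = Step_1, stratum = Cat, alluvium = flow_id,
                     y = freq, fill = Cat)) +
  geom_flow() +
  geom_stratum(alpha = .9)

# Layer 1 is geom_flow(); the result holds the computed positions
# (x, ymin, ymax, ...) of both ends of each flow
flow_data <- ggplot2::layer_data(p, 1)
head(flow_data)
```

Any re-plotting from `flow_data` would have to be built by hand with other geoms, which is outside what the package provides out of the box.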

corybrunson commented 6 months ago

Hi @MatthieuStigler, thanks for checking. My short answer is that the answers on StackOverflow are correct.

One way to hack this might be to plot separate alluvium layers (using multiple calls of stat_alluvium() or geom_alluvium()) using subsets of the full data. But you'd have to somehow force the heights of the lodes and strata to agree by assigning some of the data transparent color. I think it would be easier to build the plot you have in mind from the bottom up, and this answer linked in the comments seems like a good starting point.
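
A rough sketch of the transparency idea, as a variant of the suggestion above rather than a tested recipe: instead of subsetting the data (which changes the stacking), the full data stays in the flow layer and the unwanted flows are made fully transparent. The `visible` column and the object name `p_hidden` are hypothetical names introduced for illustration. This hides flows but does not make flows overlap, and the order of flows within a stratum may differ from the original plot.

```r
library(ggalluvial)

# Same data as in the question
dat <- tibble::tribble(
  ~Step_1,  ~Cat, ~freq, ~flow_id,
  "Step_1", "A",  40,    1,
  "Step_2", "A",  40,    1,
  "Step_1", "A",  28,    2,
  "Step_2", "B",  28,    2,
  "Step_1", "B",  12,    3,
  "Step_2", "B",  12,    3,
  "Step_1", "B",  15,    4,
  "Step_2", "A",  15,    4
)

# Hypothetical helper column: flows 3 and 4 start in stratum B,
# so they are kept visible; flows 1 and 2 are hidden
dat$visible <- ifelse(dat$flow_id %in% c(3, 4), "show", "hide")

p_hidden <- ggplot(dat, aes(x = Step_1, stratum = Cat, alluvium = flow_id,
                            y = freq, fill = Cat, label = Cat)) +
  scale_x_discrete(expand = c(.1, .1)) +
  # alpha is mapped only in the flow layer, so the strata are unaffected
  geom_flow(aes(alpha = visible)) +
  # hidden flows still take up their share of each stratum, but are invisible
  scale_alpha_manual(values = c(show = 0.5, hide = 0), guide = "none") +
  geom_stratum(alpha = .9) +
  geom_text(stat = "stratum", size = 3)

p_hidden
```

Because every flow is still passed to the stat, the strata keep their full heights and the visible flows are not rescaled.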

MatthieuStigler commented 6 months ago

Thanks for your prompt answer, much appreciated!

I must say using repeated calls of geom_flow() sounds less daunting than using the code linked. Is there any chance you could give me some hints on how I should proceed?

For example, I tried using geom_flow(dat = subset(dat, flow_id %in% c(1))) + in the code above, but that seems to rescale the flows. How would you plot only some of the flows (say, the ones from B) in the code above?

Thanks!

corybrunson commented 6 months ago

It would probably take me a while to figure out how to finagle your example from the {ggalluvial} layers; i'd have to start very small and work my way up. Normally i'm happy to spend an hour or two tinkering and troubleshooting, but the package was not designed for this use case, and i want to avoid becoming a resource for that sort of support (as with previous issues requesting gaps between strata). It might be a good question for StackOverflow—or you could restate your original question there in terms of hacking {ggalluvial} rather than using it "out of the box".

I'm sorry to be of little help! Whatever workaround you find, it would be great to post here (and on SO) in case others want to take advantage of it.

MatthieuStigler commented 6 months ago

For sure, I understand! I was hoping that the example above would already be the most minimal example to work from, but if "filtering out" a single flow while keeping the others in place would already take you an hour, the whole task seems quite complicated, so I will give up on it.

Thanks for your advice and rapid responses, much appreciated!

corybrunson commented 6 months ago

Godspeed!