corybrunson / ggalluvial

ggplot2 extension for alluvial plots
http://corybrunson.github.io/ggalluvial/
GNU General Public License v3.0

Alignment of geom_text combined with stat="flow" to reflect the number of input/output cases (alluvia) #58

Closed Generalized closed 4 years ago

Generalized commented 4 years ago

Dear Community, let's assume I have the following alluvial graph (as below), produced with this portion of code: geom_text(aes(label=n), stat = "flow", size=2.5, show.legend = FALSE, nudge_x = -0.5)

This generates the smaller numbers, telling the reader the number of subjects "leaving" the previous time point and "entering" the next one. But they are mixed together. It's hard to guess which one is which.

Is there any way to align the number of cases (subject_id) that "enter" the time point (my X axis is about time points) to the left of this time point, and cases that "leave" the previous one to its right?

I cannot provide the real data, but let me illustrate this: [image]

For the Baseline, we have 30 subjects in the green box. 28 of them "leave" this time point and go to the green area at the M3 time point. 1 leaves the green baseline and goes to the red area at M3, and 1 leaves the green baseline and goes to the grey rectangle (bottom) at M3.

For M3, 25 subjects leave the green box and 25 enter the green box at M6. 2 "greeners" leave M3 and enter the red box at M6. 1 "greener" leaves M3 and enters the grey box at M6. 25 + 3 = 28 = the number of subjects at M3.
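
To make the bookkeeping concrete, here is a minimal base R sketch that tabulates the Baseline-to-M3 flows for the 30 "green" subjects described above. The data frame `toy` and its column names are made up for illustration; only the counts (28, 1 and 1) come from the example.

```r
# Hypothetical toy data: the 30 subjects who are "green" at Baseline.
# Counts follow the description above (28 stay green, 1 turns red, 1 turns grey).
toy <- data.frame(
  subject_id = 1:30,
  Baseline   = "green",
  M3         = c(rep("green", 28), "red", "grey")
)

# One row per flow: stratum it leaves, stratum it enters, number of subjects.
flows <- as.data.frame(
  table(from = toy$Baseline, to = toy$M3),
  responseName = "n",
  stringsAsFactors = FALSE
)
print(flows)

# The flows leaving the green Baseline box add up to its size.
sum(flows$n)
```

Each row of `flows` is one of the small numbers in the plot: it "leaves" the `from` stratum at Baseline and "enters" the `to` stratum at M3.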

I would like to align these small numbers to left/right of the appropriate time point areas, to reflect the number of items "entering" and "leaving" it in an alluvium. nudge_x shifts all of them. Probably I should map the x position in aes(x=......), but how to determine when this value corresponds to an alluvium "leaving" and "entering" a certain time point?

Since I can't provide reproducible code, please just tell me whether it's possible and which option should be used.

Just to give a rough idea of the entire code (irrelevant formatting parts are cut):

ggplot(data = changes,
       aes(x = time, label = n, stratum = change, alluvium = subject_id)) +
  geom_flow(aes(fill = change), stat = "alluvium", lode.guidance = "forward", cement.alluvia = TRUE) +
  geom_stratum(aes(fill = change), alpha = .8) +
  geom_text(aes(label = n), stat = "stratum", size = 2.8) +
  geom_text(aes(label = n, color = time == "Baseline"), stat = "flow", size = 2.5, show.legend = FALSE, nudge_x = -.5)

corybrunson commented 4 years ago

Good example. This is not possible now but will be soon!

The current release does not distinguish between incoming and outgoing flows in the outputs of the statistical transformations (Stat*$compute_*()), so the plot layers of ggalluvial cannot accomplish this.

The devel branch has a solution using computed variables, which will be included in the next release. If you want to try it out now, you can install from that branch and see the uses of after_stat() in the order-rectangles vignette. In this case I think you'll want to use after_stat(count * (flow == "to")) for one side and the same with "from" for the other side (along with suitable values of nudge_x).

Note that the solution is only for stat_flow(), since stat_alluvium() layers have same-height ribbons on either side of each axis.
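
A rough sketch of what this could look like, assuming ggalluvial >= 0.12.0 and ggplot2 >= 3.3.0 (for after_stat()). It uses the `vaccinations` data set that ships with ggalluvial as a stand-in for the private data, and it assumes that the computed variable `flow` is "to" for flows arriving at an axis and "from" for flows departing from it; if the labels land on the wrong sides, swap the two values. Instead of multiplying by the logical, which would print zeros for the other side, this variant sets the unwanted labels to NA and drops them with na.rm = TRUE. The nudge_x values are guesses to be tuned by eye.

```r
library(ggplot2)
library(ggalluvial)

# Stand-in data shipped with ggalluvial (not the data from the question).
data(vaccinations)

p <- ggplot(vaccinations,
            aes(x = survey, stratum = response, alluvium = subject, y = freq)) +
  geom_flow(aes(fill = response)) +              # stat = "flow" is the default
  geom_stratum(aes(fill = response), alpha = .8) +
  # Incoming flows: label placed to the left of each axis.
  geom_text(aes(label = after_stat(ifelse(flow == "to", count, NA))),
            stat = "flow", size = 2.5, nudge_x = -0.25, na.rm = TRUE) +
  # Outgoing flows: label placed to the right of each axis.
  geom_text(aes(label = after_stat(ifelse(flow == "from", count, NA))),
            stat = "flow", size = 2.5, nudge_x = 0.25, na.rm = TRUE)

print(p)
```

The flow ribbons here are drawn with the default stat = "flow" so that the labels and the ribbons come from the same statistical transformation.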

If that's not clear, or if you have trouble, I can give it a try with some toy data tomorrow.

corybrunson commented 4 years ago

@Generalized this should be possible now in v0.12.0. Check out the vignette on ordering rectangles for some examples and please let me know if you encounter any problems!