Open Yingjie4Science opened 3 months ago
Hi @Yingjie4Science, thanks for checking. I believe the reason this syntax doesn't work is that using after_stat() anywhere in the expression passed to label defers the entire expression until after the stat has been computed, not just the part contained in after_stat(). The variable survey is not preserved by StatFlow, so it's not recognized. Instead, you can use the variable x, to which survey is passed, though since x is made numeric you'll need to know what number it corresponds to:
library(ggalluvial)
#> Loading required package: ggplot2
# rightward flow aesthetics for vaccine survey data, with cubic flows
data(vaccinations)
vaccinations$response <- factor(vaccinations$response,
                                rev(levels(vaccinations$response)))
# annotate fixed-width ribbons with counts
ggplot(vaccinations,
       aes(x = survey, stratum = response, alluvium = subject,
           weight = freq, fill = response)) +
  geom_lode() + geom_flow(curve_type = "cubic") +
  geom_stratum(alpha = 0) +
  geom_text(
    stat = "flow",
    aes(
      label = ifelse(x == 3, after_stat(n), NA),
      hjust = (after_stat(flow) == "to")
    )
  )
#> Warning: Removed 44 rows containing missing values or values outside the scale range
#> (`geom_text()`).
Created on 2024-06-27 with reprex v2.1.0
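To avoid hard-coding the number, the position of an axis can be looked up by name. This is only a sketch: it assumes the discrete x positions follow the order of the factor levels of survey (the default for a discrete scale), and axis_index is a name introduced here for illustration.

```r
library(ggalluvial)  # provides the vaccinations data used in this thread

data(vaccinations)

# assumption: x positions follow the level order of the axis variable
# (factor() leaves an existing factor's levels unchanged)
survey_levels <- levels(factor(vaccinations$survey))
survey_levels

# look up the position of one axis by name instead of hard-coding it
axis_index <- match("ms460_NSA", survey_levels)
axis_index
```

For "ms460_NSA" this should agree with the x == 3 used in the example above, and axis_index could then stand in for the literal 3 in the ifelse() condition.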
Maybe it would be worthwhile to have the Stat*s preserve the variables passed to aesthetics. I'll leave this issue open as a reminder to try that.
Thank you @corybrunson ! It's good to know that x is made numeric.
I have two follow-up questions related to the annotations.
x == 2?
ggplot(vaccinations,
       aes(x = survey, stratum = response, alluvium = subject,
           weight = freq, fill = response)) +
  geom_lode() + geom_flow(curve_type = "cubic") +
  geom_stratum(alpha = 0) +
  geom_text(
    stat = "flow",
    aes(
      # label = ifelse(x == 2, after_stat(n), NA),
      label = ifelse(x == 2, scales::percent(after_stat(prop), accuracy = 0.1), NA),
      hjust = (after_stat(flow) == "to")
    )
  )
Hi @Yingjie4Science, i think (1) can be done by additionally conditioning the labels on after_stat(flow) == "to" (or against after_stat(flow) == "from"). Please report back on whether that works, or i can try it later.
I don't think (2) has a straightforward solution. It might also be something to implement as an additional computed variable, maybe stratum_count or just sum for the total of count within each stratum?
Hi @corybrunson Thanks! The first solution works perfectly.
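For anyone reading along, here is a sketch of how that conditioning might look when added to the earlier example. The combined condition (x == 2 together with after_stat(flow) == "to") is my reading of the suggestion, not code posted in this thread; swap "to" for "from" to label the other side of the axis.

```r
library(ggalluvial)

data(vaccinations)
vaccinations$response <- factor(vaccinations$response,
                                rev(levels(vaccinations$response)))

p <- ggplot(vaccinations,
            aes(x = survey, stratum = response, alluvium = subject,
                weight = freq, fill = response)) +
  geom_lode() + geom_flow(curve_type = "cubic") +
  geom_stratum(alpha = 0) +
  geom_text(
    stat = "flow",
    aes(
      # label only the flows arriving at the second axis; all others get NA
      label = ifelse(x == 2 & after_stat(flow) == "to", after_stat(n), NA),
      hjust = (after_stat(flow) == "to")
    )
  )
p
```

As in the first example, a warning about removed rows is expected, since every flow that fails the condition gets an NA label.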
I am still struggling with (2) - are you suggesting we add an extra column to the dataframe? I am not sure how to call that data and use it in the label argument.
@Yingjie4Science i apologize, i think i lost track of this exchange as other obligations piled up.
Regarding (2), i tried to write up my own understanding of computed variables here. Please let me know if the idea is clear. My proposal for (2) is then to add a new computed variable for within-stratum sums or proportions. This could be done quickly; i just need to think through the conventions (i.e. what to call these new columns) and consequences (i.e. make sure they don't introduce backward incompatibilities).
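To make the idea of within-stratum sums and proportions concrete, here is a rough sketch that computes them from the raw data in base R. This is not a ggalluvial feature and not the proposed computed variable; the column names stratum_total and stratum_prop are hypothetical, and the values are per row of the data (one subject at one survey) rather than per flow.

```r
library(ggalluvial)  # only needed here for the vaccinations data

data(vaccinations)

# total freq within each stratum, i.e. each survey-by-response combination
# (stratum_total is a hypothetical name, not a ggalluvial variable)
vaccinations$stratum_total <- ave(
  vaccinations$freq,
  vaccinations$survey, vaccinations$response,
  FUN = sum
)

# share of each row within its stratum (also a hypothetical name)
vaccinations$stratum_prop <- vaccinations$freq / vaccinations$stratum_total

head(vaccinations[, c("survey", "response", "freq",
                      "stratum_total", "stratum_prop")])
```

A computed variable inside the stat would need to do something similar for flows, grouping by axis and stratum, which is where the naming and backward-compatibility questions come in.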
Hi @corybrunson I have a similar but slightly different question: in your last example here, is it possible to only show the labels on the last axis, i.e., "ms460_NSA"?
I have tried
label = ifelse(survey == "ms460_NSA" & after_stat(n) > 10, after_stat(n), NA))
but got the error "object 'survey' not found".
Originally posted by @Yingjie4Science in https://github.com/corybrunson/ggalluvial/issues/114#issuecomment-2190916709