Open AntonKjellberg opened 1 month ago
Hi @AntonKjellberg, thanks for raising the issue.
Looking back, I think the query functions `is_alluvia_form()` and `is_lodes_form()` need to be better documented and their parameters overhauled to match the aesthetic mappings. Here's the check you want to run, based on the aesthetic mappings you've specified:
is_lodes_form(toy, key = time, value = genus, id = id)
When I run it, I get the following message:
#> Duplicated id-axis pairings.
#> [1] FALSE
So, the problem is that some values of `id` appear with the same value of `time` more than once, which is not allowed in an alluvial plot. In fact, there are many such duplications:
count(toy, time, id)
#> # A tibble: 6 × 3
#>   time     id     n
#>   <chr> <dbl> <int>
#> 1 1m        1    11
#> 2 1m        2    11
#> 3 1w        1    11
#> 4 1w        2    11
#> 5 3m        1    11
#> 6 3m        2    11
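The same duplication check can be sketched in base R without any packages. This is only an illustration: `toy_sketch` is a hypothetical stand-in that mimics the described shape of `toy` (3 time points, 2 ids, 11 genera), and the genus names and abundance values are placeholders, not the original data.

```r
# Hypothetical stand-in for `toy`: 3 time points x 2 ids x 11 genera = 66 rows.
# Genus names and abundance values are placeholders, not the original data.
toy_sketch <- expand.grid(
  time  = c("1w", "1m", "3m"),
  id    = 1:2,
  genus = paste0("genus_", 1:11),
  stringsAsFactors = FALSE
)
toy_sketch$abundance <- 1 / 11  # placeholder: equal relative abundance

# Count the rows for each (time, id) pair; lodes form allows at most one.
pair_counts <- aggregate(abundance ~ time + id, data = toy_sketch, FUN = length)
names(pair_counts)[3] <- "n"
print(pair_counts)

# TRUE if any (time, id) pair is repeated, which is the condition reported
# by the "Duplicated id-axis pairings" message.
has_duplicates <- any(duplicated(toy_sketch[, c("time", "id")]))
print(has_duplicates)
```

Any pair with `n` greater than 1 means that `id` alone cannot serve as the alluvium identifier for that axis.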
You'll need to think carefully about what information you want to convey in the plot. What are the individuals or groups (`alluvium`) that you want to track across multiple measurements (`x`), and what values can they take (`stratum`)? Is there a plot in the examples that is similar to what you want?
Thank you for your reply, Cory
That makes sense! Unfortunately, I still struggle to display the data how I want.
I want a plot like this where streams connect the blocks based on the genus abundance within the different ids
ggplot(toy, aes(x = time, stratum = genus, y = abundance, fill = genus)) +
geom_stratum()
The plot on the page linked below has the same overall structure, but its data weren't available. (There, wave corresponds to time, n to abundance, and key to genus, with id as the alluvium.)
https://longitudinalanalysis.com/visualizing-transitions-in-time-using-r-and-alluvial-graphs/
I couldn't find a similar plot in the examples
Hi @AntonKjellberg, notice from the source that the second plot is based on an `id` variable derived from a row index when the data were in wide (or "alluvia") form, which is why each value of `id` only appears once in the same row with any value of `wave`. In your data, `id` is manually defined to be several repetitions of only two values, which would only allow for two alluvia in the plot. You'll need a different identifier if you want a similar plot; since I don't know the provenance of your data, I don't want to speculate on how it's structured, and therefore how the identifiers should be defined.
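To make the uniqueness requirement concrete, here is a minimal base R sketch of one possible identifier. It rests on an assumption that may not suit the real data: that each combination of `id` and `genus` should be tracked as its own alluvium. The data frame `toy_sketch` and the column `alluvium_id` are hypothetical names, and the values are placeholders.

```r
# Hypothetical stand-in for `toy`: 3 time points x 2 ids x 11 genera = 66 rows.
# Genus names and abundance values are placeholders, not the original data.
toy_sketch <- expand.grid(
  time  = c("1w", "1m", "3m"),
  id    = 1:2,
  genus = paste0("genus_", 1:11),
  stringsAsFactors = FALSE
)
toy_sketch$abundance <- 1 / 11  # placeholder: equal relative abundance

# Assumption: treat each (id, genus) combination as one alluvium.
# `alluvium_id` is a hypothetical column name introduced for this sketch.
toy_sketch$alluvium_id <- interaction(toy_sketch$id, toy_sketch$genus, drop = TRUE)

# With this identifier, no (time, alluvium_id) pair occurs more than once.
pairs_unique <- !any(duplicated(toy_sketch[, c("time", "alluvium_id")]))
print(pairs_unique)

# Number of alluvia the plot could then contain (2 ids x 11 genera).
n_alluvia <- nlevels(toy_sketch$alluvium_id)
print(n_alluvia)
```

With ggalluvial loaded, the earlier check could be rerun as `is_lodes_form(toy_sketch, key = time, value = genus, id = alluvium_id)`, which should no longer report duplicated pairings. Note that under this assumption an alluvium never changes genus, so every flow stays within one stratum across time; whether that is the picture you want depends on what the data represent.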
Hi!
What an amazing package.
I'm trying to display how the mean relative abundance of different bacterial genera develops over time. I filtered for IDs that have samples for all three time points; however, I still can't get it to work.
Here is a fraction of the dataset. `abundance` has one entry for every combination of `time` (3), `id` (only 2 here), and `genus` (11): 3 × 2 × 11 = 66 in total.
Error in `geom_stratum()`:
! Problem while computing stat.
ℹ Error occurred in the 1st layer.
Caused by error in `setup_data()`:
! Data is not in a recognized alluvial form (see `help('alluvial-data')` for details).
Run `rlang::last_trace()` to see where the error occurred.