Closed HanjoStudy closed 6 years ago
For some reason it needs a group aesthetic.
library(tidyverse)
library(ggridges)
# Build 1000 normal draws (mean 1, sd 100) tagged with a single date
gen_date_dist <- function(df_date) {
  data.frame(df_date, out = rnorm(1000, 1, 100))
}

## Generate random samples, one block of draws per month
df_ridge <- seq(as.Date("2010-01-01"), by = "month", length.out = 20) %>%
  map(~ .x %>% gen_date_dist) %>%
  reduce(rbind) %>%
  as_tibble()

# A Date (numeric) y needs an explicit group aesthetic
df_ridge %>%
  ggplot(., aes(x = out, y = df_date, group = df_date)) +
  geom_density_ridges()
#> Picking joint bandwidth of 22.3
Created on 2018-05-15 by the reprex package (v0.2.0).
Actually, this is in the documentation:
The grouping aesthetic does not need to be provided if a categorical variable is mapped onto the y axis, but it does need to be provided if the variable is numerical.
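To make the documented rule concrete, here is a minimal sketch of the two ways to satisfy it. The data frame df_small and its columns are made up for illustration: either convert the date to a factor so that y is categorical, or keep the Date and map it to group as well.

```r
library(ggplot2)
library(ggridges)

set.seed(1)
dates <- seq(as.Date("2010-01-01"), by = "month", length.out = 3)
# Illustrative data: 200 draws for each of three months
df_small <- data.frame(df_date = rep(dates, each = 200), out = rnorm(600, 1, 100))

# Option 1: categorical y, no group aesthetic needed
p_factor <- ggplot(df_small, aes(x = out, y = factor(df_date))) +
  geom_density_ridges()

# Option 2: keep the Date on y and supply group explicitly
p_group <- ggplot(df_small, aes(x = out, y = df_date, group = df_date)) +
  geom_density_ridges()
```

Option 1 gives a discrete axis, so date scales no longer apply; option 2 keeps a continuous date axis.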
Hello, and thank you so much for ggridges, cowplot, and all your contributions to the R ecosystem!
It sounds like a numeric y will always generate an error - is that correct? If so, would it make sense to check for that condition and provide an error message telling the user that they need to convert y or specify group if they want to use the numeric y?
If you think this is a good idea, I'm happy to take a stab if you can point me to the file where it should belong.
Thanks!
The first step would be to investigate what exactly the cause of the error is.
Good point! I think the fundamental issue is that ggplot needs to know that density should be calculated separately for each "group." Normally, it infers that groupiness based on whether the relevant variable is.discrete(), which in our case is not true (see last bullet below).
For the details, here's what I've tracked down:
GeomDensityRidges' setup_data() requires a y value for the data passed to the transform() call, but y isn't present in that data frame if it is numeric in the source data frame, so the transform() call errors out.
StatDensityRidges' compute_group() seems to never return a data frame with a y variable, but if there are multiple groups I think it is added back in by this mapply() call that occurs after compute_group() executes.
If that mapply() is what's adding y back in, it's not clear to me why it works when there are multiple values for data$group but not when there is a single value, and I can't figure out how to drop into that compute_panel() call to inspect further (presumably because of my lack of experience debugging ggproto methods).
For determining whether stats should be calculated separately by group:
compute_aesthetics() assigns data$group values with add_group(). If group is not specified in aes(), add_group() decides whether there are groups via is.discrete(), so only factor or character vectors will have non-negative and >1 values for data$group.
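The group inference described above can be imitated in a few lines of base R. This is a simplified, hypothetical stand-in, not ggplot2's actual code: the helper names is_discrete() and infer_group() are invented here, and the use of -1 as the "no group" marker and the treatment of logical vectors as discrete are assumptions about ggplot2's internals.

```r
# Hypothetical, simplified imitation of ggplot2's group inference
is_discrete <- function(x) is.factor(x) || is.character(x) || is.logical(x)

infer_group <- function(data) {
  disc <- vapply(data, is_discrete, logical(1))
  if (!any(disc)) {
    # No discrete variable: every row lands in one "no group" bucket
    return(rep(-1L, nrow(data)))
  }
  # One group id per unique combination of the discrete variables
  as.integer(interaction(data[disc], drop = TRUE))
}

dates <- rep(seq(as.Date("2010-01-01"), by = "month", length.out = 3), each = 2)
df_numeric_y <- data.frame(x = rnorm(6), y = dates)          # Date y
df_factor_y  <- data.frame(x = rnorm(6), y = factor(dates))  # factor y

group_numeric <- infer_group(df_numeric_y)  # all -1: no groups found
group_factor  <- infer_group(df_factor_y)   # 1 1 2 2 3 3: one group per date
```

With a Date on y nothing is discrete, so the whole panel is treated as a single group and the density is computed once instead of once per date.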
This can probably be solved somehow by reimplementing compute_panel()
in StatDensityRidges. To debug, you could copy that function over from ggplot2 into ggridges and then just add print statements to see what happens.
Ok, so the mapply() call in compute_panel() is adding back in variables that are constant within groups but were removed by compute_group(). Since there aren't any groups when y is numeric, its behavior isn't relevant for resolving this problem.
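The "add back constant variables" step can be sketched in base R to show why y disappears when everything falls into one group. The functions compute_group_sketch() and compute_panel_sketch() are hypothetical simplifications written for this illustration, not the ggplot2 or ggridges implementations.

```r
# Hypothetical stand-in for a stat's compute_group(): returns no y column
compute_group_sketch <- function(df) {
  d <- density(df$x)
  data.frame(x = d$x, density = d$y)
}

# Hypothetical stand-in for compute_panel(): split by group, compute,
# then re-attach every column that is constant within the group
compute_panel_sketch <- function(data) {
  pieces <- split(data, data$group)
  stats <- lapply(pieces, compute_group_sketch)
  out <- mapply(function(new, old) {
    const <- vapply(old, function(col) length(unique(col)) == 1, logical(1))
    keep <- setdiff(names(old)[const], names(new))
    extra <- old[rep(1, nrow(new)), keep, drop = FALSE]
    rownames(extra) <- NULL
    cbind(new, extra)
  }, stats, pieces, SIMPLIFY = FALSE)
  do.call(rbind, out)
}

set.seed(1)
dates <- seq(as.Date("2010-01-01"), by = "month", length.out = 3)
panel <- data.frame(x = rnorm(150), y = rep(dates, each = 50))

# Date y without a group aesthetic: one catch-all group, y varies inside it
no_group <- compute_panel_sketch(cbind(panel, group = -1L))
# Explicit grouping by date: y is constant within each group
by_date <- compute_panel_sketch(cbind(panel, group = as.integer(factor(panel$y))))

has_y_no_group <- "y" %in% names(no_group)  # FALSE
has_y_by_date  <- "y" %in% names(by_date)   # TRUE
```

In the single-group case y takes several values inside the group, so it is not treated as constant and never returns to the data frame, which matches the missing y seen in setup_data().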
I think a solution will require explicitly making the code treat unique y values as groups. For ridgeline plots y must be categorical, so doing that doesn't require making assumptions about the user's intent.
I think the implementation options are more or less:
… group aesthetic
… group aesthetic based on the y values very early in the code and let the plotting machinery work as normal
… compute_panel() to induce group-wise processing without using the group aesthetic directly
I'm not sure which of these is the best option.
I'm hesitant to introduce code that works around assumptions made in the bowels of ggplot2. They might change at some point and then it's difficult to fix. And ignoring the group aesthetic would also be bad. We still need to be able to group within y values, for example.
So all considered, I think adding an informative error is the right way to go at this time. It's easy to add a group aesthetic if we know we need one.
Sounds good - I will double-check whether it needs to be added for the other geoms as well.
Do you want me to submit a pull request, or would you prefer to add it yourself? If the former, where should the error check go - setup_data()?
You can submit a pull request. The check should be right where the error occurs. Check whether there's a y column in the data, and if not throw an error which includes a sentence such as "Did you forget to specify a group aesthetic?"
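A minimal sketch of such a check, assuming it runs on the data frame the geom receives. The helper name check_y_present() and the first half of the message are invented for illustration; only the closing question comes from the suggestion above.

```r
# Hypothetical helper: fail early with a hint when y is missing
check_y_present <- function(data) {
  if (!"y" %in% names(data)) {
    stop(
      "No `y` column found in the computed data. ",
      "Did you forget to specify a group aesthetic?",
      call. = FALSE
    )
  }
  invisible(data)
}

# Example: data without y triggers the informative error
msg <- tryCatch(
  check_y_present(data.frame(x = 1:3, density = c(0.1, 0.2, 0.1))),
  error = function(e) conditionMessage(e)
)
```

Data that already contains y passes through unchanged, so the check would not affect plots that currently work.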
Error when using variable of class Date on y-axis plotting densities.
Without transformation we get an error:
In essence, when we have a very dense y-axis we will want to scale the axis using: scale_y_date(date_labels = "%Y")
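For reference, a minimal sketch of that scaling combined with the group aesthetic, on made-up data (df_dense and its columns are illustrative only):

```r
library(ggplot2)
library(ggridges)

set.seed(1)
dates <- seq(as.Date("2010-01-01"), by = "month", length.out = 20)
# Illustrative data: 200 draws for each of 20 months
df_dense <- data.frame(df_date = rep(dates, each = 200), out = rnorm(4000, 1, 100))

# Date stays on a continuous axis; group tells the stat to split by date
p_dates <- ggplot(df_dense, aes(x = out, y = df_date, group = df_date)) +
  geom_density_ridges() +
  scale_y_date(date_labels = "%Y")
```

Keeping the Date class on y is what makes scale_y_date() usable; converting to a factor would give a discrete axis instead.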
P.S. I tried reprex, with multiple failures after reinstalling knitr and reprex; I hope this is OK.