Open thomasp85 opened 5 years ago
I think this is a good idea. I very much adhere to the philosophy that we should optimize the API so we don't usually have to provide the dataset more than once. This is in line with @yutannihilation's gghighlight and my ideas for in-layer sampling in the ungeviz package.
I'm just not fully convinced about `static`. A good name would make sense in the context of facets first. I can't think of anything better, though.
I couldn't agree more! Currently, gghighlight needs ugly column names (you cannot always remove the column, as it might be mapped to a necessary aesthetic) to prevent the unhighlighted data from being faceted.
Completely open to another name... too deep in animation at the moment to be able to think of something better though (except for `repeat`, which I don't particularly like).
How would we handle the repetition when the user wants to repeat along rows only or columns only? Below is an example to clarify why we might want to do this. I think of it as a margin, but instead of having it in a separate panel we want it in the background, so each panel becomes a "highlighted layer".
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(ggplot2)
mtcars_plot <- mutate(mtcars,
                      cyl = as.factor(cyl),
                      vs = as.factor(vs),
                      gear = as.factor(gear))
mtcars_static <- mutate(mtcars_plot, gear = NULL, vs = NULL)
mtcars_static2 <- mutate(mtcars_plot, gear = NULL)

ggplot(mtcars_plot, aes(x = cyl)) +
  geom_bar(data = mtcars_static, fill = 'grey70') +
  geom_bar() +
  facet_grid(cols = vars(gear), rows = vars(vs), margins = TRUE)

ggplot(mtcars_plot, aes(x = cyl)) +
  geom_bar(data = mtcars_static2, fill = 'red') +
  geom_bar() +
  facet_grid(cols = vars(gear), rows = vars(vs), margins = TRUE)
Created on 2019-01-09 by the reprex package (v0.2.1)
The old approach would still work, but this would solve 99% of the needs in a more elegant way
`common = TRUE`?
My brain woke up again... `fixed` is obviously the right argument name, and already part of the lingua ggplot2a.
I don't think the argument has to be as generic as one word, since this is probably for expert use. What about something more specific like `ignore_facet` or `allow_facet`, if we allow users to choose which variables the data is static on?
ggplot(diamonds, aes(x = color)) +
  geom_bar(fill = 'grey70', ignore_facet = TRUE) +
  geom_bar() +
  facet_wrap(~cut)

ggplot(mtcars_plot, aes(x = cyl)) +
  geom_bar(fill = 'red', ignore_facet = vars(gear)) +
  geom_bar() +
  facet_grid(cols = vars(gear), rows = vars(vs), margins = TRUE)
If this doesn't make sense, I'll vote for @thomasp85's `fixed`.
I would really like something that doesn’t mention facet as I want to use it in gganimate as well
Ah, sorry, I misunderstood the context and how it's going to be implemented. Is this about whether to respect `PANEL` or not around here? (I thought this was about the implementation of `Facet`.) If so, I agree with `fixed`.
I haven't done any POC yet, but in general this should be a way for layers to broadcast that they are part of the "background", and not to be split up. It will certainly require some changes to the different facet implementations...
I just had the bright idea of allowing this argument to be either a logical or a character vector defining what type of fixed the layer should be (it will be up to the different facets, gganimate, and extensions to define how to interpret the strings). E.g. something like `fixed = c('rows', 'frames')` would repeat the data on the rows (as above) and during animation...
Aren't rows/cols only meaningful for `facet_grid`? In the `facet_wrap` above, the "row" is two rows. The user will still have to make sure the layers are specified in the right order. Also, with the new custom aesthetics, the user might want separate scales for these "background"/"reference" visual elements, so that a legend can define what they are. E.g. above we would have a fill legend with a gray area for the background data, named as such, and then we might want another fill legend to identify the regular fill mapping.
Thanks for clarification, it makes sense.
> e.g. something like `fixed = c('rows', 'frames')` would repeat the data on the rows (as above) and during animation...
I feel `row` and `frame` are too specific then.
The user might want to control whether the data is split over
This is pretty complicated if we discuss all of them at once...
I don't think the name of the variable should be mixed into this... that is part of the facet spec.
I'm envisioning multiple valid strings to be used:
- `facet` will fix it during faceting (but not in animation or whatever else gets implemented)
- `facet-row` will fix it over facet rows in `facet_grid` but be ignored otherwise
- `facet-col` like above
- `frames` will fix it over the animation (ggplot2 shouldn't care about this of course - this is just logic I'll build into gganimate)

The test in e.g. `facet_grid` for whether to fix the data across rows would be:
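# any() collapses the match to a single TRUE/FALSE, so `||` also works
# when `fixed` is a character vector with several keywords
if (isTRUE(params$fixed) || any(params$fixed %in% c('facet', 'facet_row'))) {
  # do whatever needed to fix the data
}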
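As a rough sketch of how a facet or an extension might resolve such a `fixed` value, the check could be wrapped in a small helper. The name `is_fixed_for()` and its arguments are hypothetical and not part of ggplot2; the only assumption is that `fixed` is `NULL`, a single logical, or a character vector of keywords, as proposed in this thread.

```r
# Hypothetical helper, not part of ggplot2: decide whether a layer's `fixed`
# setting applies to a consumer that recognises the given keywords.
is_fixed_for <- function(fixed, keywords) {
  if (is.null(fixed)) {
    return(FALSE)                 # argument not supplied: layer is split as usual
  }
  if (is.logical(fixed)) {
    return(isTRUE(fixed))         # TRUE fixes the layer everywhere
  }
  any(fixed %in% keywords)        # character vector: match any known keyword
}

row_keywords <- c("facet", "facet_row")

is_fixed_for(TRUE, row_keywords)                      # fixed everywhere
is_fixed_for(c("facet_row", "frames"), row_keywords)  # fixed across facet rows
is_fixed_for("frames", row_keywords)                  # unknown keyword is ignored here
```

Unknown strings simply fail to match, so a keyword such as `frames` can be interpreted by gganimate without ggplot2 knowing anything about it.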
> that is part of the facet spec.
Ah, that's convincing. Thanks!
> `params$fixed %in% c('facet', 'facet_row')`

I'm curious about this part. So, though `frame` is suggested in the docs, ggplot2 doesn't need to know how and by whom that keyword will be used, right? If so, I agree with your idea.
I don’t think `frame` should be mentioned in the ggplot2 docs. I just included it to show a non-facet use.
OK, then it looks OK to me. I was just worried that ggplot2 would have to know about the implementation of gganimate.
Ah, no. There needs to be a strict separation IMO.
I'm sorry for commenting on an issue last discussed here in January, but wouldn't a simple data-subsetting wrapper produce the desired behaviour? Something along the lines of:
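ggsubset <- function(rowtest = NULL, omit = NULL) {
  # Capture both arguments unevaluated so they can refer to column names
  rowtest <- substitute(rowtest)
  if (is.null(rowtest)) {
    rowtest <- TRUE  # keep all rows by default
  }
  omit <- substitute(omit)
  if (is.null(omit)) {
    omit <- TRUE  # keep all columns by default (`-TRUE` would drop the first column)
  } else {
    omit <- bquote(-.(omit))  # negate the selection to drop the named column
  }
  # Return a function of the plot data, to be used as a layer's `data`
  function(x) subset.data.frame(x, eval(rowtest), eval(omit))
}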
Wherein `rowtest` is a logical expression of which rows to keep (e.g. `Species == "setosa"` in the iris dataset) and `omit` is a column name you want to exclude for faceting purposes. It wouldn't store a complete `data.frame` in the `plot$layer[[...]]$data` slot, so it is more memory efficient than copying an extra `diamonds_static <- mutate(diamonds, cut = NULL)`.
This would handle the diamond case as follows:
ggplot(diamonds, aes(x = color)) +
  geom_bar(data = ggsubset(omit = cut), fill = 'grey70') +
  geom_bar() +
  facet_wrap(~cut)
Or the iris dataset:
ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
  geom_point(data = ggsubset(omit = Species), colour = "grey70") +
  geom_point(aes(colour = Species)) +
  facet_wrap(~Species)
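This works because a layer's `data` argument may be a function, which ggplot2 calls with the plot's default data. The sketch below shows the same mechanism with a plain function instead of the wrapper. `drop_species()` is a hypothetical name used only for illustration, and the plotting part runs only if ggplot2 is installed.

```r
# Hypothetical one-off function: return the plot data without the faceting column
drop_species <- function(d) d[setdiff(names(d), "Species")]

# What the background layer would receive: all rows, no `Species` column
background <- drop_species(iris)
names(background)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # The grey layer has no `Species` column, so it is repeated in every panel
  p <- ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
    geom_point(data = drop_species, colour = "grey70") +
    geom_point(aes(colour = Species)) +
    facet_wrap(~Species)
}
```

Only the function is stored in the layer, so the subset is computed from the plot data when the plot is built rather than kept as a second copy.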
The current approach to repeating a layer across panels is to not have the layer data contain the variable needed for the faceting. This is an approach also implemented in gganimate when it comes to having layers be static across the animation. While this works intuitively, I feel it often requires some additional data prep before the plotting code, and sometimes requires that layers that otherwise use the same data target separate, almost identical datasets.
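To make the data prep concrete, here is a minimal base R sketch of the current approach. The helper `drop_vars()` is hypothetical and only stands in for whatever column removal one would normally write by hand.

```r
# Plot data with the candidate faceting variables as factors
mtcars_plot <- transform(mtcars,
                         cyl = factor(cyl),
                         vs = factor(vs),
                         gear = factor(gear))

# Hypothetical helper: a copy of `data` without the given columns
drop_vars <- function(data, vars) data[setdiff(names(data), vars)]

# One near-identical dataset is needed per faceting choice
static_gear <- drop_vars(mtcars_plot, "gear")  # background when faceting by gear
static_vs   <- drop_vars(mtcars_plot, "vs")    # background when faceting by vs
```

Each background layer then has to point at the matching dataset, so switching the faceting variable means touching both the data prep and the layer.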
Would it make sense to add a `static` or perhaps `repeat` argument to geom and stat functions to explicitly mark them for being repeated across panels (and frames in the case of gganimate)?
Example API:
The "problem" with the current approach is that it requires changes to the data source if we decide to change the faceting variable — not a huge problem, but still a barrier to experimentation.
If the `static` name is too close to the idea of animation, then we can figure out another name for it...