tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org
Other
6.51k stars 2.02k forks source link

Reference lines break facet_wrap when facets are the product of a function #2963

Closed felipegerard closed 3 years ago

felipegerard commented 6 years ago

I want to add some reference lines to a faceted plot. The lines should be the same in every facet. I want to use a custom value not mapped from the data. It works just fine as long as I use plain variables for faceting, but when I mix or alter them, I get an error, which depends on the operation I perform.

I think the problem lies in the fact that the plot wants to compute facets with for the non-mapped layer, but the data doesn't exist. I don't think geom_line has a problem, but rather facet_wrap.

## Example 1 ------------------------------
## Works
ggplot(mtcars, aes(mpg, cyl)) +
  geom_point() +
  geom_vline(xintercept = 20) +
  facet_wrap(vars(am))

## Doesn't work
ggplot(mtcars, aes(mpg, cyl)) +
  geom_point() +
  geom_vline(xintercept = 20) +
  facet_wrap(vars(2 * am)) # <-- there's an operation involved

## Example 2 ------------------------------
## Works
ggplot(mtcars, aes(cyl)) +
  geom_bar() +
  geom_vline(xintercept = 4.5) +
  facet_wrap(vars(am, vs))

## Doesn't work
ggplot(mtcars, aes(cyl)) +
  geom_bar() +
  geom_vline(xintercept = 4.5) +
  facet_wrap(vars(paste(am, vs))) # <-- there's an operation involved

## Easiest fix I can think of: move the operation to a mutate
mtcars %>% 
  mutate(facet = paste(am, vs)) %>% 
  ggplot(aes(cyl)) +
  geom_bar() +
  geom_vline(xintercept = 4.5) +
  facet_wrap(vars(facet))
batpigandme commented 6 years ago

Adding a reprex. It's helpful to have the code as well as the output with error messages. The error message is, in effect, the same, that the first object mentioned in the operation (in this case am) cannot be found. Also added an example of the same code with the operation but without the vline, which also works.

library(ggplot2)
library(dplyr)
## Example 1 ------------------------------
## Works
ggplot(mtcars, aes(mpg, cyl)) +
  geom_point() +
  geom_vline(xintercept = 20) +
  facet_wrap(vars(am))


## Doesn't work
ggplot(mtcars, aes(mpg, cyl)) +
  geom_point() +
  geom_vline(xintercept = 20) +
  facet_wrap(vars(2 * am)) # <-- there's an operation involved
#> Error in rlang::eval_tidy(facet, data, env): object 'am' not found

## Works without `geom_vline()`
ggplot(mtcars, aes(mpg, cyl)) +
  geom_point() +
  facet_wrap(vars(2 * am)) 

Created on 2018-10-25 by the reprex package (v0.2.1.9000)

felipegerard commented 6 years ago

I didn't know the reprex package. It's very cool! I can't use it for images, though, because my work firewall has banned imgur...

paleolimbot commented 5 years ago

This is also the case when using tidyeval in vars():

library(ggplot2)
ggplot(mtcars, aes(mpg, cyl)) +
  geom_point() +
  geom_vline(xintercept = 20) +
  facet_wrap(vars(.data$am))
#> Column `am` not found in `.data`
paleolimbot commented 5 years ago

Leaving this here to ponder...I don't know if there's an existing way to inspect a quosure to find which objects in data it refers to.

https://github.com/tidyverse/ggplot2/blob/4ce39539a00ec18fe145bc9ab8bfb8c27b116a2c/R/facet-.r#L424-L437

Eisit commented 5 years ago

Hi, I had the same problem and I solved it through adding its own aesthetic to the geom_vline: geom_vline(xintercept = 20) => geom_vline(aes(xintercept = 20))

I can't explain why, though, as the current help file for geom_vline does not mention it. I only tried it because I found aes() added in geom_vline in old stackoverflow answers.

yutannihilation commented 4 years ago

I don't know if there's an existing way to inspect a quosure to find which objects in data it refers to.

A quick note. gghighlight had a similar issue and fixed in https://github.com/yutannihilation/gghighlight/pull/78 by ignoring the failed evaluations (and stops only when the all calculations fail). I think the fix would be the same here because the problem is not whether the symbols in the expression refer to some data but whether the expression can be successfully evaluated (e.g. am might be an variable that defines on the global environment). So, I think there's no way to know if the evaluation will success or not other than actually evaluating it.

paleolimbot commented 4 years ago

It's been a long time since I looked into this, but I did at one point write code that figures out which variables from a quosure are referred to. It's been too long since I looked at the facet code to know if something like that is needed or whether an evaluation check would work, but I do think this should be fixed, as .data$column_name is valid (even encouraged) usage.

yutannihilation commented 4 years ago

Nice work, thanks. I think your code covers most of the cases. Still, I concern

  1. there probably are some corner cases; e.g. not_a_column * a_column_not_included_in_all_layers should be evaluated on the layers containing the column, but the current logic (IIUC, it's "if the expression containing any non-column variable, return NULL because the evaluation would fail") probably won't.
  2. mimicking the spec of .data might lead to some difficulties when some breaking change is introduced on tidyeval's side. (That said, this seems unlikely to happen. So, I think we can ignore this point).

If tryCatch()ing the evaluation is too expensive, I agree with polishing the static analysis of the expression, but I'm not sure at the moment...

paleolimbot commented 4 years ago

I'm not either...these are all good points. The static analysis is more complicated than I would like for this problem. We could special case .data$ and .data[[ to work with the current behaviour, but it wouldn't solve the problem in this issue!

clauswilke commented 4 years ago

If I understand correctly, the general problem is that eval_tidy() simply fails when a variable can't be looked up, when sometimes we would want to have a default value be substituted instead. I've run into this problem also, in the context of customizable key glyphs. I worked around it defining this function: https://github.com/wilkelab/cowplot/blob/06aeeb449ccc0fb343eb03c83111df7b171e73af/R/key_glyph.R#L145-L154

library(rlang)

eval_default <- function(x, data, default) {
  force(default)

  suppressWarnings(
    tryCatch(
      error = function(e) default,
      eval_tidy(x, data)
    )
  )
}

data <- data.frame(x = 1)
eval_default(quo(x), data, NULL)
#> [1] 1
eval_default(quo(y), data, NULL)
#> NULL

Created on 2020-01-12 by the reprex package (v0.3.0)

What I found strange is that it needed the suppressWarnings() part. Without it I was getting weird warnings. I don't recall now what they were, and they would only arise when I was using the function inside ggplot2, so it's not easily reproducible.

In any case, I wonder whether something like this eval_default() function should be part of rlang, and if I should file an issue.

yutannihilation commented 4 years ago

@paleolimbot Ah, after looking at #3346, now I see your context. To issue a user-friendly warning, we need to do a static analysis even if it's complicated, as well as trying actually evaluating it. It seems this issue is also the case where both are needed.

@clauswilke My understanding is the same. I agree with filing an issue on rlang's repo. Will you?

clauswilke commented 4 years ago

Yes, I'll file an issue.

I remember the warning now. It was "restarting interrupted promise evaluation." That seemed sufficiently scary that I didn't just want to ignore it, but there doesn't seem to be any actual problem associated with this warning.

clauswilke commented 4 years ago

Issue submitted: https://github.com/r-lib/rlang/issues/888

yutannihilation commented 4 years ago

Thanks!

felipegerard commented 4 years ago

Thank you all for your hard work! I honestly had no idea that the rabbit hole would go this deep. I'm out of my depth but you guys seem to be on top of things :)