tidyverse / tidyeval

A guide to tidy evaluation
https://tidyeval.tidyverse.org
55 stars 21 forks source link

Tangling with dots #24

Open hadley opened 6 years ago

hadley commented 6 years ago

I wrote this for Advanced R, but it doesn't feel quite right there. I think it might be better in the programming with dplyr vignette (whatever that ends up being)

Tangling with dots

In our grouped_mean() example above, we allow the user to select one grouping variable, and one summary variable. What if we wanted to allow the user to select more than one? One option would be to use .... There are three possible ways we could use ... it:

Implementing each one of these is relatively straightforward, but what if we want to be able to group by multiple variables, summarise multiple variables, and pass extra args on to mean(). Generally, I think it is better to avoid this sort of API (instead relying on multiple function that each do one thing) but sometimes it is the lesser of the two evils, so it is useful to have a technique in your backpocket to handle it.

grouped_mean <- function(df, groups, vars, args) {

  var_means <- map(vars, function(var) expr(mean(!!var, !!!args)))
  names(var_means) <- map_chr(vars, expr_name)

  df %>%
    dplyr::group_by(!!!groups) %>%
    dplyr::summarise(!!!var_means)
}

grouped_mean(mtcars, exprs(vs, am), exprs(hp, drat, wt), list(na.rm = TRUE))

If you use this design a lot, you may also want to provide an alias to exprs() with a better name. For example, dplyr provides the vars() wrapper to support the scoped verbs (e.g. summarise_if(), mutate_at()). aes() in ggplot2 is similar, although it does a little more: requires all arguments be named, naming the the first arguments (x and y) by default, and automatically renames so you can use the base names for aesthetics (e.g. pch vs shape).

grouped_mean(mtcars, vars(vs, am), vars(hp, drat, wt), list(na.rm = TRUE))

Exercises

  1. Implement the three variants of grouped_mean() described above:

    # ... passed on to mean
    grouped_mean <- function(df, group_by, summarise, ...) {}
    # ... selects variables to summarise
    grouped_mean <- function(df, group_by, ...) {}
    # ... selects variables to group by
    grouped_mean <- function(df, ..., summarise) {}