Tangling with dots

I wrote this for Advanced R, but it doesn't feel quite right there. I think it might be better in the programming with dplyr vignette (whatever that ends up being)

In our grouped_mean() example above, we allow the user to select one grouping variable, and one summary variable. What if we wanted to allow the user to select more than one? One option would be to use .... There are three possible ways we could use ... it:

Pass ... onto the mean() function. That would make it easy to set na.rm = TRUE. This is easiest to implement.
Allow the user to select multiple groups
Allow the user to select multiple variables to summarise.

Implementing each one of these is relatively straightforward, but what if we want to be able to group by multiple variables, summarise multiple variables, and pass extra args on to mean(). Generally, I think it is better to avoid this sort of API (instead relying on multiple function that each do one thing) but sometimes it is the lesser of the two evils, so it is useful to have a technique in your backpocket to handle it.

grouped_mean <- function(df, groups, vars, args) {

  var_means <- map(vars, function(var) expr(mean(!!var, !!!args)))
  names(var_means) <- map_chr(vars, expr_name)

  df %>%
    dplyr::group_by(!!!groups) %>%
    dplyr::summarise(!!!var_means)
}

grouped_mean(mtcars, exprs(vs, am), exprs(hp, drat, wt), list(na.rm = TRUE))

If you use this design a lot, you may also want to provide an alias to exprs() with a better name. For example, dplyr provides the vars() wrapper to support the scoped verbs (e.g. summarise_if(), mutate_at()). aes() in ggplot2 is similar, although it does a little more: requires all arguments be named, naming the the first arguments (x and y) by default, and automatically renames so you can use the base names for aesthetics (e.g. pch vs shape).

grouped_mean(mtcars, vars(vs, am), vars(hp, drat, wt), list(na.rm = TRUE))

Exercises

Implement the three variants of grouped_mean() described above:

# ... passed on to mean
grouped_mean <- function(df, group_by, summarise, ...) {}
# ... selects variables to summarise
grouped_mean <- function(df, group_by, ...) {}
# ... selects variables to group by
grouped_mean <- function(df, ..., summarise) {}

tidyverse / tidyeval

Tangling with dots #24

Tangling with dots

Exercises