tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/

Referentially transparent column names in `dplyr::filter` #3138

Closed jankowtf closed 6 years ago

jankowtf commented 6 years ago

I would like to make use of quo()/enquo() and !! in calls to dplyr::filter in order to make the column name referentially transparent much in the same way that is supported by dplyr::mutate.

Ideally, I'd like to end up with something like this (pseudo code): df %>% filter(!!<reference> == <value>)

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
set.seed(89234)
df <- data.frame(id = rep(1:2, 3), value = rpois(6, 10))

c_id <- as.name("id")
c_value <- as.name("value")
v_id <- 1

# Example of my usage of `mutate` -----
my_multiply <- function(x, by) x * by
df %>% mutate(!!c_value := my_multiply(!!c_value, 10))
#>   id value
#> 1  1   140
#> 2  2   110
#> 3  1    90
#> 4  2   120
#> 5  1    90
#> 6  2    50

# Trying to do something similar with `filter` -----

# Trying to construct a call to `dplyr::quo` or `dplyr::enquo` where the left
# part contains the **evaluated** reference of the column name while the right
# part contains the **non-evaluated** reference of the logical query to be
# evaluated:

my_filter <- function(x, left, right) {
  quo_expr <- quo(quo(!!left) >= right)
  # quo_expr <- quo(!!quo(left) >= right)
  print(quo_expr)
  x %>% filter(!!quo_expr)
}
my_filter(df, c_id, v_id)
#> <quosure: frame>
#> ~quo(id) >= right
#> [1] id    value
#> <0 rows> (or 0-length row.names)

lionel- commented 6 years ago

Could you try with:

left <- enquo(left)
quo_expr <- quo((!! left) >= right)

jankowtf commented 6 years ago

@lionel- then I end up with this:

my_filter <- function(x, left, right) {
  # quo_expr <- quo(quo(!!left) >= right)
  # quo_expr <- quo(!!quo(left) >= right)
  left <- enquo(left)
  quo_expr <- quo((!! left) >= right)
  print(quo_expr)
  x %>% filter(!!quo_expr)
}
v_id <- 1
my_filter(df, c_id, v_id)
#> <quosure: frame>
#> ~(~c_id) >= right
#>   id value
#> 1  1    14
#> 2  2    11
#> 3  1     9
#> 4  2    12
#> 5  1     9
#> 6  2     5

lionel- commented 6 years ago

Is the following what you're trying to achieve?

my_filter <- function(x, left, right) {
  left <- enquo(left)
  right <- enquo(right)
  filter(x, (!! left) >= (!! right))
}

my_filter(df, !! c_id, !! c_value)

jankowtf commented 6 years ago

@lionel- sorry, I overlooked an error in the initial run of reprex() in the original post; I've updated it.

jankowtf commented 6 years ago

Great, thanks!

I think it's easier to grasp when using == instead of >= in my example (sorry, not my best reprex-day today): I wanted a more flexible version of df %>% filter(id == 1) and yours works perfectly!

This would be the equivalent call then: my_filter(df, !! c_id, !! v_id)
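
For reference, a minimal self-contained sketch of the `==` variant described here. The values in `df` are typed in by hand from the output printed earlier in the thread (rather than regenerated with `rpois()`), and `my_filter_eq` is a hypothetical name used only for this illustration.

```r
library(dplyr)

# Values copied from the output printed earlier in the thread
df <- data.frame(id = rep(1:2, 3), value = c(14, 11, 9, 12, 9, 5))

c_id <- as.name("id")  # column reference stored as a symbol
v_id <- 1              # value to compare against

# Hypothetical helper: same shape as my_filter above, but with `==`
my_filter_eq <- function(x, left, right) {
  left <- enquo(left)
  right <- enquo(right)
  # Parentheses keep each `!!` from swallowing the comparison
  filter(x, (!! left) == (!! right))
}

# Same rows as df %>% filter(id == 1)
res <- my_filter_eq(df, !! c_id, !! v_id)
print(res)
```

The call keeps only the three rows where `id` equals 1, matching what `df %>% filter(id == 1)` returns.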

lionel- commented 6 years ago

Great!

By the way, Stack Overflow (where I follow the tidyeval tag) or https://community.rstudio.com are better venues for this type of question.

jankowtf commented 6 years ago

Thanks, noted. I'll have a look at the tag.

Seems like it actually "only" boils down to using parentheses!

df %>% filter(!! c_id == v_id) # fails
df %>% filter((!! c_id) == v_id) # works

lionel- commented 6 years ago

Yes, `!` has low operator precedence, so it basically captures everything to its right.
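
The precedence point can be checked with base R alone, without dplyr. The sketch below only parses the two expressions with `quote()` and inspects the outermost call; the variable names are illustrative and nothing is evaluated.

```r
# quote() only parses; c_id and v_id do not need to exist
unparenthesised <- quote(!! c_id == v_id)
parenthesised   <- quote((!! c_id) == v_id)

# Without parentheses the outermost call is `!`: the comparison sits inside
# the double negation, i.e. the expression reads as !(!(c_id == v_id))
outer_unparen <- as.character(unparenthesised[[1]])
inner_call    <- unparenthesised[[2]][[2]]

# With parentheses the outermost call is `==`, which is what filter() needs
outer_paren <- as.character(parenthesised[[1]])

print(outer_unparen)  # "!"
print(inner_call)     # c_id == v_id
print(outer_paren)    # "=="
```

This shows the parser-level behaviour only. Later rlang releases are understood to treat `!!` with a higher precedence inside quasiquoting functions, so the unparenthesised form may behave differently there; adding the parentheses is unambiguous either way.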