markfairbanks / tidytable

Tidy interface to 'data.table'
https://markfairbanks.github.io/tidytable/

`cur_column()` inside a lambda #699

Closed Darxor closed 1 year ago

Darxor commented 1 year ago

Weirdly, cur_column() works in a formula-style lambda but fails when the function is declared with function() or the \(x) shorthand. In {dplyr} it works as intended, both as a default argument in a function's signature and inside a function's body.

library(tidytable)
df <- data.frame(a = 1, b = 2)

df |> 
  mutate(
    across(
      everything(),
      ~ cur_column() 
    )
  )
#> # A tidytable: 1 × 2
#>   a     b    
#>   <chr> <chr>
#> 1 a     b

df |> 
  mutate(
    across(
      everything(),
      \(x) cur_column() 
    )
  )
#> Error in `cur_column()`:
#> ! cur_column() should only be used inside across()

#> Backtrace:
#>      ▆
#>   1. ├─tidytable::mutate(df, across(everything(), function(x) cur_column()))
#>   2. │ ├─tidytable::mutate.(...)
#>   3. │ └─tidytable:::mutate..data.frame(...)
#>   4. │   └─tidytable::mutate(...)
#>   5. │     ├─tidytable::mutate.(...)
#>   6. │     └─tidytable:::mutate..tidytable(...)
#>   7. │       └─rlang::eval_tidy(dt_expr, .df, dt_env)
#>   8. ├─...[]
#>   9. └─data.table:::`[.data.table`(...)
#>  10.   └─base::eval(jsub, SDenv, parent.frame())
#>  11.     └─base::eval(jsub, SDenv, parent.frame())
#>  12.       ├─vctrs::vec_recycle((function(x) cur_column())(a), .N)
#>  13.       └─(function(x) cur_column())(a)
#>  14.         └─tidytable::cur_column()
#>  15.           └─rlang::abort("cur_column() should only be used inside across()")

df |> 
  dplyr::mutate(
    dplyr::across(
      everything(),
      \(x) dplyr::cur_column()
    )
  )
#>   a b
#> 1 a b

Created on 2022-11-30 with reprex v2.0.2

EDIT: I should add that this behavior extends to any function that calls cur_column(), not just lambdas.

For example, the following works in {dplyr} but fails in tidytable:

my_fn <- function(...) cur_column()
df |> 
  mutate(
    across(
      everything(),
      my_fn
    )
  )

moutikabdessabour commented 1 year ago

I think this is due to the way tidytable currently handles the special functions. In layman's terms, it replaces the call to the special function with the data.table equivalent. See utils-prep_exprs.R for the full implementation details.

If there is a way to peek into the function that is passed to across(), this could be solved.

markfairbanks commented 1 year ago

prep_exprs() sends the across() work to expand_across() which can be found in utils-across.R.

There's an internal helper in there called replace_cur_column() that is missing this case. I also just noticed that anonymous functions in across() aren't getting translated at all.

pacman::p_load(tidytable)

df <- data.frame(a = 1, b = 2)

df %>%
  summarize(
    across(everything(), \(x) n())
  )
#> Error in `n()`:
#> ! n() should only be used inside tidytable verbs

markfairbanks commented 1 year ago

All set, thanks for catching this.

library(tidytable, w = FALSE)

df <- data.frame(a = 1, b = 2)

df |> 
  mutate(
    across(
      everything(),
      \(x) cur_column()
    )
  )
#> # A tidytable: 1 × 2
#>   a     b    
#>   <chr> <chr>
#> 1 a     b

df |> 
  mutate(
    across(
      everything(),
      \(x) n()
    )
  )
#> # A tidytable: 1 × 2
#>       a     b
#>   <int> <int>
#> 1     1     1

markfairbanks commented 1 year ago

I just realized I forgot to mention one other thing. The example you gave can't work in tidytable.

my_fn <- function(...) cur_column()
df |> 
  mutate(
    across(
      everything(),
      my_fn
    )
  )

Context functions (like n() or cur_column()) won't work if hidden inside another function like my_fn().
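
To illustrate why a named function is opaque to this kind of translation, here is a minimal base-R sketch. It assumes, as described earlier in the thread, that the translation rewrites the expression passed to across(). The helper replace_cur_column_toy() is hypothetical and written only for this illustration; it is not tidytable's replace_cur_column() and does not reflect the package's actual code.

```r
# Toy illustration only: this is NOT tidytable's actual implementation.
# Walk an unevaluated expression and replace every call to `cur_column()`
# with the literal column name.
replace_cur_column_toy <- function(expr, col_name) {
  if (!is.call(expr)) {
    return(expr)  # symbols, constants, argument lists: nothing to rewrite
  }
  if (identical(expr[[1]], as.name("cur_column"))) {
    return(col_name)
  }
  as.call(lapply(as.list(expr), replace_cur_column_toy, col_name = col_name))
}

# An anonymous function is written inline, so its body is part of the
# expression and the call to cur_column() can be found and replaced.
lambda_expr <- quote(function(x) cur_column())
f <- eval(replace_cur_column_toy(lambda_expr, "a"))
f(1)

# A named function appears only as the symbol `my_fn`. Its body is not
# part of the expression, so there is nothing to replace.
named_expr <- quote(my_fn)
identical(replace_cur_column_toy(named_expr, "a"), named_expr)
```

In this sketch the rewritten lambda returns the column name, while the symbol my_fn comes back unchanged. That would be consistent with the behavior reported above, where the call to cur_column() inside my_fn() is never translated.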