Open bakaburg1 opened 7 months ago
I noticed a possibly related problem from this Stack Overflow question. This code results in an error:
iris |>
group_nest(Species) |>
mutate(boo = map(data, function(x) {
colsym <- sym("Sepal.Length")
x %>% mutate(newcol = !!colsym)
}))
# Error: object 'colsym' not found
Same issue if using .data
iris |>
group_nest(Species) |>
mutate(boo = map(data, function(x) {
colname <- "Sepal.Length"
x %>% mutate(uncle=.data[[colname]])
}))
But if you define the function first rather than inline, it will run
helper <- \(x) {
colsym <- sym("Sepal.Length")
x %>% mutate(newcol=!!colsym)
}
iris |>
group_nest(Species) |>
mutate(data = map(data, helper))
Looking at the trace, it seems the problem is actually coming from rlang::quos. Something else that will trigger the error is:
quos(\(x) {colsym <- sym("a"); mutate(x, newcol=!!colsym)})
Basically the !! part is being evaluated when defining the function, not when calling the function. Is there a way to delay the evaluation of the !! or .data[[]] when applied to functions? Tested with rlang_1.1.3, purrr_1.0.2, dplyr_1.1.4.
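To make the timing concrete, here is a minimal sketch using only rlang, outside of dplyr. The names outer_value, captured, fn_body and msg are made up for illustration. It assumes that rlang::expr() applies the same unquoting rules as quos(), which is my understanding but not something the trace above shows directly.

```r
library(rlang)

# `colsym` exists where the expression is captured, so `!!` is resolved
# right away, even though it sits inside a function definition.
colsym <- sym("outer_value")
captured <- expr(function(x) !!colsym)

# Element 3 of a `function` call is the function body.
# It no longer contains `!!colsym`; the symbol was injected at capture time.
fn_body <- captured[[3]]
print(fn_body)

# With no `colsym` visible at capture time, the capture itself fails,
# before the inner function is ever called.
rm(colsym)
msg <- tryCatch(
  {
    expr(function(x) { colsym <- sym("a"); !!colsym })
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

So the `colsym` assigned inside the function body never gets a chance to run: the `!!` is looked up in the environment where the expression is captured.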
You'd have the same issue if you nest bquote() calls: the .() are substituted by the outside calls. It's similar if you nest substitute() calls.
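A small base R sketch of the bquote() analogy, with made-up names (value, nested). The .() written for the inner call is consumed by the outer call instead:

```r
value <- "outer"

# The `.()` is meant for the inner bquote(), but the outer bquote()
# walks the whole expression and substitutes it first.
nested <- bquote(bquote(.(value)))
print(nested)
```

The inner call is left with a constant in place of .(value), so there is nothing left for it to substitute later.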
You can use the following trick:
protect <- function(expr) call("!", call("!", substitute(expr)))
rlang::expr(!!protect(hello))
#> !!hello
library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 4.2.3
#> Warning: package 'stringr' was built under R version 4.2.3
iris |>
group_nest(Species) |>
mutate(boo = map(data, function(x) {
colsym <- sym("Sepal.Length")
x %>% mutate(newcol = !!protect(colsym))
}))
#> # A tibble: 3 × 3
#> Species data boo
#> <fct> <list<tibble[,4]>> <list>
#> 1 setosa [50 × 4] <tibble [50 × 5]>
#> 2 versicolor [50 × 4] <tibble [50 × 5]>
#> 3 virginica [50 × 4] <tibble [50 × 5]>
For .data it's a bit different, I think. In the given examples .data shouldn't be used: it is not to be considered an object in scope but a special operator at the top level of mutate(), though the dot doesn't make that very clear. It seems a macro is run when we use .data, where "using" means calling it with brackets (so if you set a browser() in the function it won't be triggered, for instance). This works around it:
iris |>
group_nest(Species) |>
mutate(boo = map(data, ~ {
colname <- "Sepal.Length"
.x %>% mutate(uncle= (.data)[[colname]])
}))
#> # A tibble: 3 × 3
#> Species data boo
#> <fct> <list<tibble[,4]>> <list>
#> 1 setosa [50 × 4] <tibble [50 × 5]>
#> 2 versicolor [50 × 4] <tibble [50 × 5]>
#> 3 virginica [50 × 4] <tibble [50 × 5]>
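The difference between the two spellings can be inspected with rlang alone. This is a sketch under an assumption: as far as I understand, rlang treats the index of .data[[ ]] as an unquote operator at capture time, while the parenthesised form does not match that pattern. The names eager and delayed are made up for illustration.

```r
library(rlang)

colname <- "Sepal.Length"

# The index of `.data[[ ]]` is resolved when the expression is captured,
# so `colname` must already exist at that point.
eager <- expr(.data[[colname]])
print(eager)

# Wrapping `.data` in parentheses hides the pattern, so `colname` stays
# as a symbol and is only looked up when the expression is evaluated.
delayed <- expr((.data)[[colname]])
print(delayed)
```

If that assumption holds, it would also explain why the lookup fails when colname only exists inside the lambda.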
Hello,
This error was driving me crazy, and it took me a while to isolate it. If one wants to use .data inside a lambda function inside across(), you cannot index .data with variables created inside the lambda function itself; otherwise R will complain that such a variable doesn't exist! Example:
Which was driving me crazy, since the variable clearly exists in the scope.
But if other_col_name is defined outside, there is no problem:
If this is by design and you don't plan to fix it, it could be useful to have a clearer error and some extra documentation somewhere!
There are cases in which, for example, the index name is defined dynamically based on the x or cur_column() value. Now that I know the issue I'll use pick()[1], unless you have better solutions.