Closed by philipp-baumann 5 years ago
You can already use := to allow unquoting on the LHS of the ... constructs in nest():
library(tidyr)
f <- function(df, vars_to_nest, new_col) {
  nest(df, !!new_col := {{ vars_to_nest }})
}
f(iris, -Species, "stuff")
#> # A tibble: 3 x 2
#> Species stuff
#> <fct> <list<df[,4]>>
#> 1 setosa [50 × 4]
#> 2 versicolor [50 × 4]
#> 3 virginica [50 × 4]
Created on 2019-10-02 by the reprex package (v0.3.0.9000)
Is this 👆what you want to do?
BTW the new vignette In packages has some good background on related issues. It's possible an example of what we're doing in this thread should be added there 🤔
If you want your wrapper to "feel" the same as nest()'s ..., then you can also use the "pass the dots" strategy.
This is a silly example, but I just wanted to give the wrapper g() some bit of logic besides calling nest():
library(tidyr)
g <- function(df, ...) {
  names(df) <- tolower(names(df))
  nest(df, ...)
}
g(iris, stuff = -species)
#> # A tibble: 3 x 2
#> species stuff
#> <fct> <list<df[,4]>>
#> 1 setosa [50 × 4]
#> 2 versicolor [50 × 4]
#> 3 virginica [50 × 4]
g(iris, petal = starts_with("petal"), sepal = starts_with("sepal"))
#> # A tibble: 3 x 3
#> species petal sepal
#> <fct> <list<df[,2]>> <list<df[,2]>>
#> 1 setosa [50 × 2] [50 × 2]
#> 2 versicolor [50 × 2] [50 × 2]
#> 3 virginica [50 × 2] [50 × 2]
Created on 2019-10-02 by the reprex package (v0.3.0.9000)
Hi Jenny, thanks for your quick response and the suggestions :+1:
Hmm, the solution with g() doesn't really apply here, and f() does not work with multiple columns to (de)select. nest_wrapper() was supposed to support multiple columns supplied by the user, which will be nested within a single list-column, using .key (or new_col, as in your example) as the name of the nested column.
The wrapper I had before did something like what is shown in the vignette In packages (thanks for the hint):
library(tidyr)
nest_egg <- function(data, cols) {
  nest(data, egg = one_of(cols))
}
nest_egg(iris, c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width"))
#> # A tibble: 3 x 2
#> Species egg
#> <fct> <list<df[,4]>>
#> 1 setosa [50 × 4]
#> 2 versicolor [50 × 4]
#> 3 virginica [50 × 4]
Created on 2019-10-02 by the reprex package (v0.3.0)
nest_wrapper() does not require double quotes, in contrast to the example above, and I have a lot of columns with chemical reference values to nest (it's always nice to save some typing) ;-). In the example, egg is hard-coded. I'd find it nice to have a flexible quasiquotation solution.
I think it would be nice to have a tidy evaluation solution for a general use case like this for nest().
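One way to drop the hard-coded egg, sketched under the assumption that the column names arrive as a character vector, is to combine the := unquoting shown earlier in the thread with tidyselect::all_of(). The helper name nest_named() and its argument names are hypothetical, and all_of() needs tidyselect 1.0.0 or later, which is newer than the reprexes in this thread:

```r
library(tidyr)

# Hypothetical helper (not part of tidyr): `cols` is a character vector of
# column names, `new_col` is a string naming the resulting list-column.
nest_named <- function(data, cols, new_col = "data") {
  # `!!new_col :=` unquotes the name on the left-hand side;
  # all_of() selects exactly the columns named in `cols`.
  nest(data, !!new_col := tidyselect::all_of(cols))
}

measure_cols <- c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width")
res <- nest_named(iris, measure_cols, new_col = "measurements")
names(res)
```

This keeps the quoted-names interface of nest_egg(), so it does not save any typing; it only makes the name of the nested column a parameter.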
My example is a bit too big, so nest_wrapper() exemplifies the minimal behavior. In terms of code, a real use case looked like this (all arguments before .key go into the dots):
# Nest chemical reference data and sample group variables
spc_refdata_BDM <-
  spc_refdata_BDM_unnested %>%
  nest_keep_lcols(
    As_tot, B_AAE10, BS, Ca_AAE10, CaCO3, Cd_tot, carbon_percent,
    nitrogen_percent, CN, Corg, cTOC, cTOC_pool_20, TC_pool_20, TN_pool_20,
    TS, Cu_AAE10, DNA_Menge, Fe_AAE10, Humus, K_AAE10, K_AAE10_GRUD,
    KAKpot_cmol_kg, Mg_AAE10, Mg_AAE10_GRUD, Mn_AAE10, P_AAE10, P_AAE10_GRUD,
    Pb_tot, U_tot, Zn_AAE10, pH, RG_FE, w_gFP, Sand, Schluff, Ton,
    .key = "refdata")
I now have a solution; great that you mentioned that := works here. I'm happy with using dplyr::vars() (this way the function interface even becomes a bit cleaner, because nest_cols states the intent, instead of the less informative dots):
library(tidyr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df <- tibble::tibble(
  a = 1:10,
  b = 1:10,
  c = 1:10,
  d = list(rep(1:10, 10))
)
nest_wrapper <- function(.data, nest_cols, new_col = "data") {
  new_col <- rlang::enquo(new_col)
  nest_cols_nm <- purrr::map_chr(nest_cols, rlang::as_name)
  tidyr::nest(.data = .data, !!new_col := tidyselect::one_of(nest_cols_nm))
}
nest_wrapper(.data = df, nest_cols = vars(a, b, c), new_col = "refdata")
#> # A tibble: 1 x 2
#> d refdata
#> <list> <list<df[,3]>>
#> 1 <int [100]> [10 × 3]
Created on 2019-10-02 by the reprex package (v0.3.0)
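For comparison, a minimal sketch of the same interface without dplyr::vars(), assuming the caller is willing to wrap the bare column names in c(). The name nest_wrapper2() and the example tibble are hypothetical; the pattern is the := plus {{ }} combination from f() above:

```r
library(tidyr)

# Hypothetical variant of nest_wrapper(): `nest_cols` is any tidyselect
# expression (for example c(a, b, c)), `new_col` is a string.
nest_wrapper2 <- function(.data, nest_cols, new_col = "data") {
  # {{ }} forwards the selection expression; !!new_col names the list-column.
  nest(.data, !!new_col := {{ nest_cols }})
}

df2 <- tibble::tibble(
  a = 1:10,
  b = 1:10,
  c = 1:10,
  g = rep(c("x", "y"), 5)
)

res2 <- nest_wrapper2(df2, c(a, b, c), new_col = "refdata")
names(res2)
```

Because the selection is forwarded as written, several bare column names can be passed in a single c() call, and helpers such as starts_with() can be passed in the same position.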
I find tidyr::nest() very useful and use it quite heavily in my modeling workflows. I mostly do so using wrappers and quasiquotation/tidyeval.
As of tidyr 1.0.0 the .key argument is deprecated and issues a friendly warning. However, I couldn't figure out an equivalent solution to capture user input for the name of the resulting nested list-column without .key.
It seems that omitting .key would involve unquoting the left-hand side provided in the new ... interface, for example unquoting data in c(data = c(x, y, z)). The reprex is provided below the text.
It would be neat if nest() supported this in the near future. Are there any plans to realize this, in a similar way as := in dplyr::mutate()? Or is there a workaround that allows wrapping !!!cols into the equivalent of c(data = c(x, y, z))?
Thanks a lot in advance for some hints. Cheers, Philipp
Created on 2019-10-02 by the reprex package (v0.3.0)