tidyverse / dtplyr

Data table backend for dplyr
https://dtplyr.tidyverse.org

Support tidyselect #447

Closed vRadAdamBender closed 1 year ago

vRadAdamBender commented 1 year ago

dtplyr does not support all tidyselect helpers. Can support for the missing ones be implemented?

Error in across():
! This tidyselect interface doesn't support predicates.
Backtrace:

markfairbanks commented 1 year ago

where() doesn't work with dtplyr (or dbplyr) because both work lazily. This means columns don't necessarily exist until collect() is called, so a predicate has nothing to inspect. Unfortunately, there's no way to add this functionality either, hence the error message "This tidyselect interface doesn't support predicates."
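To illustrate the laziness described above, here is a minimal sketch, assuming dtplyr and dplyr are installed. The pipeline only records a data.table translation; nothing is computed until the result is requested with as_tibble() (or collect()). The object names `lazy` and `result` are just for illustration.

```r
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

# Build a lazy pipeline: each verb is recorded as a step, not executed.
lazy <- lazy_dt(iris) %>%
  filter(Sepal.Length > 5) %>%
  select(starts_with("Sepal"))

# Print the data.table call that dtplyr has generated so far.
show_query(lazy)

# Only now is the generated call evaluated and the data materialised.
result <- as_tibble(lazy)
```

Name-based helpers such as starts_with() can be resolved at translation time because they only need the column names, whereas a predicate needs the column contents.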

It's also worth noting that other tidyselect functions work just fine - starts_with(), ends_with(), etc.

library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

iris %>%
  lazy_dt() %>%
  select(starts_with("S"))
#> Source: local data table [150 x 3]
#> Call:   `_DT1`[, .(Sepal.Length, Sepal.Width, Species)]
#> 
#>   Sepal.Length Sepal.Width Species
#>          <dbl>       <dbl> <fct>  
#> 1          5.1         3.5 setosa 
#> 2          4.9         3   setosa 
#> 3          4.7         3.2 setosa 
#> 4          4.6         3.1 setosa 
#> 5          5           3.6 setosa 
#> 6          5.4         3.9 setosa 
#> # ℹ 144 more rows
#> 
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results

Hope this helps! If you have any questions let me know.

yhm-amber commented 6 months ago

Is there any way to add a function named something like is_type, so that for code such as this,

data.table::as.data.table(iris) |> 
  dtplyr::lazy_dt(immutable = FALSE) |> 
  dplyr::mutate(dplyr::across(tidyselect::where(is.double), round)) |> 
  data.table::as.data.table()

we can replace tidyselect::where(is.double) with tidyselect::is_type("double"), and it no longer ends with the "doesn't support predicates" error?

Is that possible, or does a function like this already exist?


For now, I replace tidyselect::where(is.double) with tidyselect::all_of((\(.x) names(.x)[.x])(unlist(iris |> head(0) |> lapply(is.double)))) to make it work.

Here tidyselect::all_of has the same effect as tidyselect::any_of or tidyselect::one_of.

But none of these is very declarative.
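One way to make the workaround above read more declaratively is to wrap the eager predicate check in a small helper. This is only a sketch: cols_where() is a hypothetical function, not part of tidyselect or dtplyr, and it assumes the source data frame is available in memory so the predicate can be evaluated before the lazy pipeline starts.

```r
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper (not part of tidyselect or dtplyr): return the names
# of the columns of an in-memory data frame for which `predicate` is TRUE.
cols_where <- function(data, predicate) {
  names(data)[vapply(data, predicate, logical(1))]
}

# Evaluate the predicate eagerly, outside the lazy pipeline.
double_cols <- cols_where(iris, is.double)

# Pass the resulting character vector to all_of(), which only needs names.
result <- iris %>%
  lazy_dt() %>%
  mutate(across(all_of(double_cols), round)) %>%
  as_tibble()
```

The trade-off is that the selection reflects the columns of the original data frame, not columns created or changed by earlier steps of the lazy pipeline.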