etiennebacher / tidypolars

Get the power of polars with the syntax of the tidyverse
https://tidypolars.etiennebacher.com

Renaming in `select()` #117

Closed by etiennebacher 4 months ago

etiennebacher commented 4 months ago

I caught another little trick you don't support; let me know if this is asking too much or if I should open a new issue. In dplyr::select(), one can select AND rename at the same time, such as:

select(t1, x, why = y)
# A tibble: 2 × 2
  x       why
  <chr> <int>
1 a         1
2 b         2

With tidypolars, we need select(t1, x, y) |> rename(why = y) instead.

Originally posted by @ginolhac in https://github.com/etiennebacher/tidypolars/issues/116#issuecomment-2115555458

@ginolhac I prefer opening a new issue for this one, but of course you can open as many as you want :)


library(tidypolars)
library(dplyr, warn.conflicts = FALSE)

tibble(x = 1, y = 2) |> 
  select(foobar = x)
#> # A tibble: 1 × 1
#>   foobar
#>    <dbl>
#> 1      1

tibble(x = 1, y = 2) |> 
  as_polars_df() |> 
  select(foobar = x)
#> Error: Execution halted with the following contexts
#>    0: In R: in $select()
#>    1: Encountered the following error in Rust-Polars:
#>          not found: foobar
#> 
#>       Error originated just after this operation:
#>       DF ["x", "y"]; PROJECT */2 COLUMNS; SELECTION: "None"
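
For reference, the two-step workaround quoted above can be checked against the one-step form in plain dplyr. This is a minimal sketch: `t1` is a hypothetical stand-in, since the thread does not show how the original data was built, and the extra column `z` is an assumption added only so that `select()` actually drops something.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical stand-in for `t1`; not taken from the original report.
t1 <- tibble(x = c("a", "b"), y = 1:2, z = c(TRUE, FALSE))

# Select and rename in a single call.
one_step <- select(t1, x, why = y)

# Workaround from the quoted comment: select first, then rename.
two_steps <- select(t1, x, y) |> rename(why = y)

# Both pipelines should give the same columns, names and values.
all.equal(one_step, two_steps)
```

The sketch only exercises dplyr on a tibble; it does not call tidypolars.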

ginolhac commented 4 months ago

Yes, a new issue is better; my point was more about the extent to which you want to translate tidyverse functions. This renaming and named bind_rows() are part of my standard pipelines, but I could also adapt. BTW, I tested duckplyr on the same analysis and it was interesting to compare. More next time.

etiennebacher commented 4 months ago

my point was more about the extent to which you want to translate tidyverse functions

Ideally all of them. In practice this will not be possible for some time because a number of them (in particular in tidyr) are not generics, e.g. #57. The objective is that the only polars-specific steps a user takes are importing the data with a scan/read function and collecting the result. Everything in between should be standard R / tidyverse code.

Eventually, we might also extend the translations to other packages supported by the tidyverse or r-lib teams, such as clock or slider, but 1) we'll have to stop at some point (there are just too many packages) and 2) I tend to focus on the functions I use the most or that are quite standard, so supporting those packages is low-priority for me.

etiennebacher commented 4 months ago

Closed by 43db0e8:

library(dplyr, warn.conflicts = FALSE)
library(tidypolars)

tibble(x = 1, xx = 2) |> 
  as_polars_df() |> 
  select(foobar = x)
#> shape: (1, 1)
#> ┌────────┐
#> │ foobar │
#> │ ---    │
#> │ f64    │
#> ╞════════╡
#> │ 1.0    │
#> └────────┘

tibble(x = 1, xx = 2) |> 
  as_polars_df() |> 
  select(foobar = contains("x"))
#> shape: (1, 2)
#> ┌─────────┬─────────┐
#> │ foobar1 ┆ foobar2 │
#> │ ---     ┆ ---     │
#> │ f64     ┆ f64     │
#> ╞═════════╪═════════╡
#> │ 1.0     ┆ 2.0     │
#> └─────────┴─────────┘
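
The second result follows the tidyselect naming rule: a new name applied to a selection that matches several columns gets a numeric suffix. Below is a minimal base-R sketch of that rule. The helper `rename_selected()` is hypothetical; it is not part of tidypolars or tidyselect, and it says nothing about how the commit above implements the feature.

```r
# Hypothetical helper, for illustration only: build the output names for
# `new_name = <selection>` given the columns the selection matched.
rename_selected <- function(matched, new_name) {
  if (length(matched) == 0L) {
    # Nothing matched, so there is nothing to name.
    character(0)
  } else if (length(matched) == 1L) {
    # A single match takes the new name as-is.
    new_name
  } else {
    # Several matches get a numeric suffix, in column order.
    paste0(new_name, seq_along(matched))
  }
}

cols <- c("x", "xx")

# Stand-in for contains("x"): columns whose name contains the literal "x".
matched <- grep("x", cols, fixed = TRUE, value = TRUE)

rename_selected("x", "foobar")
#> [1] "foobar"
rename_selected(matched, "foobar")
#> [1] "foobar1" "foobar2"
```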