tidyverse / dtplyr

Data table backend for dplyr
https://dtplyr.tidyverse.org

`group_by` and then `mutate`/`summarise` with columns containing spaces fails #462

Closed ediferreira11 closed 9 months ago

ediferreira11 commented 9 months ago

When grouping a data frame by a column whose name contains whitespace, mutate and summarise fail with an error saying that the grouping column can't be found: Error in eval(bysub, x, parent.frame()) : object '`ah bh`' not found

The problem seems to be in the code translation, which wraps the column name in backticks twice:

copy(`_DT17`)[, `:=`(sum_col = sum(ch)), by = .(`\`ah bh\``)]
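
As a rough illustration of why the doubled backticks break the lookup, the base R sketch below builds two symbols by hand: one named `ah bh` and one whose name literally includes the backtick characters. It does not use dtplyr or data.table internals; the data frame `df` and the variable names are made up for the example, and it only assumes that the translated `by` expression ends up holding a symbol of the second kind.

```r
# Hypothetical data frame with a non-syntactic column name.
df <- data.frame(`ah bh` = 1:3, check.names = FALSE)

# A symbol whose name is exactly the column name.
good <- as.name("ah bh")

# A symbol whose name contains the backticks as literal characters,
# which is what an expression like .(`\`ah bh\``) refers to.
bad <- as.name("`ah bh`")

# Looking up the first symbol in the data frame finds the column.
good_value <- eval(good, df, baseenv())

# Looking up the second symbol fails, because no column is named
# with the backticks included; capture the error instead of stopping.
bad_result <- tryCatch(eval(bad, df, baseenv()), error = function(e) e)

print(good_value)
print(conditionMessage(bad_result))
```

The second lookup raises an "object not found" error of the same kind as the one reported above, since backticks are quoting syntax rather than part of the column name.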

Reprex below:

test_df <- tibble::tibble(
  `ah bh` = rep(1:3, 2),
  ch = 1:6
)

query <- test_df |>
  dtplyr::lazy_dt() |> 
  dplyr::group_by(`ah bh`) |> 
  dplyr::mutate(
    sum_col = sum(ch)
  )

query |> dplyr::show_query()

query |> tibble::as_tibble()

markfairbanks commented 9 months ago

This is fixed in the dev version of dtplyr 😄

You can install it using pak::pak("tidyverse/dtplyr")

test_df <- tibble::tibble(
  `ah bh` = rep(1:3, 2),
  ch = 1:6
)

test_df |>
  dtplyr::lazy_dt() |> 
  dplyr::group_by(`ah bh`) |> 
  dplyr::mutate(
    sum_col = sum(ch)
  )
#> Source: local data table [6 x 3]
#> Groups: ah bh
#> Call:   copy(`_DT1`)[, `:=`(sum_col = sum(ch)), by = .(`ah bh`)]
#> 
#>   `ah bh`    ch sum_col
#>     <int> <int>   <int>
#> 1       1     1       5
#> 2       2     2       7
#> 3       3     3       9
#> 4       1     4       5
#> 5       2     5       7
#> 6       3     6       9
#> 
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results
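
For anyone who can't install the dev version, one possible workaround (not suggested in this thread, so treat it as an assumption) is to give the grouping column a syntactic name before handing the data to lazy_dt(). The sketch below uses base R's make.names() for the rename; the names `df_syn` and `result` are made up for the example, and the dtplyr part only runs if dtplyr and dplyr are installed.

```r
# Same data as the reprex, built with base R.
test_df <- data.frame(
  `ah bh` = rep(1:3, 2),
  ch = 1:6,
  check.names = FALSE
)

# Replace non-syntactic names; "ah bh" becomes "ah.bh".
df_syn <- test_df
names(df_syn) <- make.names(names(df_syn))
print(names(df_syn))

# With a syntactic name no backtick quoting is needed in the translation.
# Guarded so the sketch still runs where the packages are missing.
if (requireNamespace("dtplyr", quietly = TRUE) &&
    requireNamespace("dplyr", quietly = TRUE)) {
  result <- df_syn |>
    dtplyr::lazy_dt() |>
    dplyr::group_by(ah.bh) |>
    dplyr::mutate(sum_col = sum(ch)) |>
    as.data.frame()
  print(result)
}
```

The original name can be restored after collecting the result if downstream code depends on it.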