Closed UchidaMizuki closed 1 year ago
Looks like a keyby arg is getting added to the select() call for some reason.
library(dplyr)
library(dtplyr)
iris %>%
  lazy_dt() %>%
  select(Species, Sepal.Length) %>%
  mutate(Sepal.Length_mean = mean(Sepal.Length),
         .by = Species)
#> Source: local data table [150 x 4]
#> Call: `_DT1`[, .(Species, Sepal.Length), keyby = .(Species)][, `:=`(Sepal.Length_mean = mean(Sepal.Length)),
#> by = .(Species)]
#>
#> Species Species Sepal.Length Sepal.Length_mean
#> <fct> <fct> <dbl> <dbl>
#> 1 setosa setosa 5.1 5.01
#> 2 setosa setosa 4.9 5.01
#> 3 setosa setosa 4.7 5.01
#> 4 setosa setosa 4.6 5.01
#> 5 setosa setosa 5 5.01
#> 6 setosa setosa 5.4 5.01
#> # ℹ 144 more rows
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results
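For comparison, here is a minimal sketch of what the generated call would presumably look like without the stray keyby, written directly in data.table. This is an assumption about the intended translation, not the actual dtplyr fix; the names dt and res are only for illustration.

```r
library(data.table)

# Plain data.table copy of iris (dt is an illustrative name)
dt <- as.data.table(iris)

# Select the two columns with no grouping, then add the per-group mean
# by reference. Assumed intended translation, not dtplyr's own output.
res <- dt[, .(Species, Sepal.Length)][
  , Sepal.Length_mean := mean(Sepal.Length), by = .(Species)
]

names(res)
#> [1] "Species"           "Sepal.Length"      "Sepal.Length_mean"
```

With no grouping in the first step, Species appears only once, which matches the 3 columns expected in the report.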
All fixed - thanks for catching this.
# Install dev version
# pak::pak("tidyverse/dtplyr")
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)
iris %>%
  lazy_dt() %>%
  select(Species, Sepal.Length) %>%
  mutate(Sepal.Length_mean = mean(Sepal.Length),
         .by = Species) %>%
  collect() %>%
  head()
#> # A tibble: 6 × 3
#> Species Sepal.Length Sepal.Length_mean
#> <fct> <dbl> <dbl>
#> 1 setosa 5.1 5.01
#> 2 setosa 4.9 5.01
#> 3 setosa 4.7 5.01
#> 4 setosa 4.6 5.01
#> 5 setosa 5 5.01
#> 6 setosa 5.4 5.01
Errors occur if .by is specified in mutate() after selecting columns. In the following example, a data frame with 3 columns (Species, Sepal.Length, Sepal.Length_mean) should be output. The second reprex without select() does not cause an error.
Created on 2023-07-27 with reprex v2.0.2