tidyverse / duckplyr

A drop-in replacement for dplyr, powered by DuckDB for performance.
https://duckplyr.tidyverse.org/

sum(..., na.rm = TRUE) creates a fall_back #205

Open meersel opened 2 months ago

meersel commented 2 months ago

Using the base sum() function with the argument na.rm = TRUE triggers a fallback to dplyr. The named argument na.rm is not recognized by the translation.

Example code:

duckplyr::as_duckplyr_tibble(my_tbl) |>
  summarise(
    total_sales = sum(sales_col, na.rm = TRUE),
    .by = c(country, item_number)
  )

hadley commented 2 months ago

Reprex:

library(dplyr, warn.conflicts = FALSE)
library(duckplyr, warn.conflicts = FALSE)
#> The duckplyr package is configured to fall back to dplyr when it encounters an
#> incompatibility. Fallback events can be collected and uploaded for analysis to
#> guide future development. By default, no data will be collected or uploaded.
#> → Run `duckplyr::fallback_sitrep()` to review the current settings.
#> ✔ Overwriting dplyr methods with duckplyr methods.
#> ℹ Turn off with `duckplyr::methods_restore()`.

mtcars |> 
  duckplyr::as_duckplyr_tibble() |> 
  summarise(wt = sum(wt, na.rm = TRUE))
#> The duckplyr package is configured to fall back to dplyr when it encounters an
#> incompatibility. Fallback events can be collected and uploaded for analysis to
#> guide future development. By default, no data will be collected or uploaded.
#> ℹ A fallback situation just occurred. The following information would have been
#>   recorded:
#>   {"version":"0.4.1","message":"Can't translate named argument `sum(na.rm =
#>   )`.","name":"summarise","x":{"...1":"numeric","...2":"numeric","...3":"numeric","...4":"numeric","...5":"numeric","...6":"numeric","...7":"numeric","...8":"numeric","...9":"numeric","...10":"numeric","...11":"numeric"},"args":{"dots":{"...6":"sum(...6,
#>   na.rm = TRUE)"}}}
#> → Run `duckplyr::fallback_sitrep()` to review the current settings.
#> → Run `Sys.setenv(DUCKPLYR_FALLBACK_COLLECT = 1)` to enable fallback logging,
#>   and `Sys.setenv(DUCKPLYR_FALLBACK_VERBOSE = TRUE)` in addition to enable
#>   printing of fallback situations to the console.
#> → Run `duckplyr::fallback_review()` to review the available reports, and
#>   `duckplyr::fallback_upload()` to upload them.
#> ℹ See `?duckplyr::fallback()` for details.
#> ℹ This message will be displayed once every eight hours.
#> # A tibble: 1 × 1
#>      wt
#>   <dbl>
#> 1  103.

Created on 2024-07-25 with reprex v2.1.0

krlmlr commented 2 days ago

The same happens with na.rm = FALSE, and passing multiple arguments to sum() fails too. With DUCKPLYR_FORCE set, the fallback surfaces as an error:

options(conflicts.policy = list(warn = FALSE))

library(dplyr)
library(duckplyr)
#> The duckplyr package is configured to fall back to dplyr when it encounters an
#> incompatibility. Fallback events can be collected and uploaded for analysis to
#> guide future development. By default, no data will be collected or uploaded.
#> → Run `duckplyr::fallback_sitrep()` to review the current settings.
#> ✔ Overwriting dplyr methods with duckplyr methods.
#> ℹ Turn off with `duckplyr::methods_restore()`.

Sys.setenv(DUCKPLYR_FORCE = TRUE)

tibble(a = 1:3) |>
  duckplyr::as_duckplyr_tibble() |>
  summarise(sum(a, na.rm = TRUE))
#> Error in `rel_translate_lang()` at duckplyr/R/translate.R:321:5:
#> ! Can't translate named argument `sum(na.rm = )`.

tibble(a = 1:3) |>
  duckplyr::as_duckplyr_tibble() |>
  summarise(sum(a, na.rm = FALSE))
#> Error in `rel_translate_lang()` at duckplyr/R/translate.R:321:5:
#> ! Can't translate named argument `sum(na.rm = )`.

tibble(a = 1:3, b = 1:3) |>
  duckplyr::as_duckplyr_tibble() |>
  summarise(sum(a, b))
#> Error: {"exception_type":"Binder","exception_message":"No function matches the given name and argument types 'sum(INTEGER, INTEGER)'. You might need to add explicit type casts.\n\tCandidate functions:\n\tsum(DECIMAL) -> DECIMAL\n\tsum(SMALLINT) -> HUGEINT\n\tsum(INTEGER) -> HUGEINT\n\tsum(BIGINT) -> HUGEINT\n\tsum(HUGEINT) -> HUGEINT\n\tsum(DOUBLE) -> DOUBLE\n","name":"sum","candidates":"sum(DECIMAL) -> DECIMAL,sum(SMALLINT) -> HUGEINT,sum(INTEGER) -> HUGEINT,sum(BIGINT) -> HUGEINT,sum(HUGEINT) -> HUGEINT,sum(DOUBLE) -> DOUBLE","call":"sum(INTEGER, INTEGER)","error_subtype":"NO_MATCHING_FUNCTION"}

Created on 2024-10-16 with reprex v2.1.1
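
For reference, here is a small base R sketch of the sum() behaviour that the three failing calls above rely on. It uses only base R; the vector a_na is made up for illustration and does not appear in the thread. It is not a statement about how duckplyr does or will translate sum(). It only shows what a translation would have to match: na.rm changes the result when NAs are present, and multiple arguments are pooled into one total, whereas the candidate list in the DuckDB error above only offers single-argument sum().

```r
# Base R only; no duckplyr or DuckDB calls are made here.
a <- 1:3
b <- 1:3
a_na <- c(1, 2, NA)  # hypothetical input with a missing value

# na.rm = TRUE drops missing values before summing.
sum(a_na, na.rm = TRUE)
#> [1] 3

# na.rm = FALSE (the default) propagates a single NA to the result.
sum(a_na, na.rm = FALSE)
#> [1] NA

# Multiple arguments are pooled: all values of a and b are added together.
sum(a, b)
#> [1] 12

# The pooled form is equivalent to adding the per-vector sums.
identical(sum(a, b), sum(a) + sum(b))
#> [1] TRUE
```

Because a plain SQL SUM skips NULLs, the na.rm = FALSE case is the one where a naive one-to-one mapping would silently change results for columns containing NA.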