duckdblabs / duckplyr

A drop-in replacement for dplyr, powered by DuckDB for performance.
https://duckdblabs.github.io/duckplyr/

duckplyr does not support lubridate #167

Open JosiahParry opened 2 months ago

JosiahParry commented 2 months ago

DuckDB supports lubridate functions when used through dbplyr. However, duckplyr does not.

I suspect it would be nice for duckplyr to have this capability too.

library(duckdb)
library(dplyr)
library(dbplyr)

con <- dbConnect(duckdb())

# dbplyr + duckdb: lubridate::month() is translated to SQL
taxi <- tbl(con, "read_parquet('taxi-data-2019-partitioned/*/*.parquet')")

db_month <- taxi |>
  mutate(month = lubridate::month(pickup_datetime))

# duckplyr: the same call triggers a fallback to dplyr
df_month <- duckplyr::duckplyr_df_from_parquet("taxi-data-2019-partitioned/*/*.parquet") |>
  mutate(month = lubridate::month(pickup_datetime))
#> The duckplyr package is configured to fall back to dplyr when it encounters an
#> incompatibility. Fallback events can be collected and uploaded for analysis to
#> guide future development. By default, no data will be collected or uploaded.
#> ℹ A fallback situation just occurred. The following information would have been
#>   recorded:
#>   {"version":"0.3.2","message":"Unknown function:
#>   `month()`","name":"mutate","x":{"...1":"character","...2":"POSIXct/POSIXt","...3":"POSIXct/POSIXt","...4":"numeric","...5":"numeric","...6":"numeric","...7":"numeric","...8":"character","...9":"character","...10":"numeric","...11":"numeric","...12":"character","...13":"numeric","...14":"numeric","...15":"numeric","...16":"numeric","...17":"numeric","...18":"numeric","...19":"numeric","...20":"numeric","...21":"numeric","...22":"numeric","...23":"character","...24":"numeric"},"args":{"dots":{"...24":"...25::...24(...2)"},".by":"NULL",".keep":["all","used","unused","none"]}}
#> → Run `duckplyr::fallback_sitrep()` to review the current settings.
#> → Run `Sys.setenv(DUCKPLYR_FALLBACK_COLLECT = 1)` to enable fallback logging,
#>   and `Sys.setenv(DUCKPLYR_FALLBACK_VERBOSE = 1)` in addition to enable
#>   printing of fallback situations to the console.
#> → Run `duckplyr::fallback_review()` to review the available reports, and
#>   `duckplyr::fallback_upload()` to upload them.
#> ℹ See `?duckplyr::fallback()` for details.
#> ℹ This message will be displayed once every eight hours.
#> materializing:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> read_parquet(taxi-data-2019-partitioned/*/*.parquet)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - vendor_name (VARCHAR)
#> - pickup_datetime (TIMESTAMP)
#> - dropoff_datetime (TIMESTAMP)
#> - passenger_count (BIGINT)
#> - trip_distance (DOUBLE)
#> - pickup_longitude (DOUBLE)
#> - pickup_latitude (DOUBLE)
#> - rate_code (VARCHAR)
#> - store_and_fwd (VARCHAR)
#> - dropoff_longitude (DOUBLE)
#> - dropoff_latitude (DOUBLE)
#> - payment_type (VARCHAR)
#> - fare_amount (DOUBLE)
#> - extra (DOUBLE)
#> - mta_tax (DOUBLE)
#> - tip_amount (DOUBLE)
#> - tolls_amount (DOUBLE)
#> - total_amount (DOUBLE)
#> - improvement_surcharge (DOUBLE)
#> - congestion_surcharge (DOUBLE)
#> - pickup_location_id (BIGINT)
#> - dropoff_location_id (BIGINT)
#> - year (VARCHAR)
#> - month (BIGINT)

krlmlr commented 2 weeks ago

Yeah, lubridate::month() is not yet on the list of supported functions: https://github.com/duckdblabs/duckplyr/pull/179/files#diff-a202cfba76540d6822868ac7755edd4945b6344057d78e0092f4836e33c0d4eaR11

krlmlr commented 1 week ago

There's now a contributing guide that covers adding new function translations: https://duckdblabs.github.io/duckplyr/CONTRIBUTING.html#new-translations-for-functions
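
Until a translation for lubridate::month() is added, one possible workaround is to run that step through the dbplyr + duckdb route described at the top of this issue, so the month extraction happens in DuckDB, and collect the result afterwards. The following is only a minimal sketch: the table name `trips` and the two sample timestamps are made up for illustration, and it relies on the dbplyr translation of lubridate::month() that the original report says works.

```r
library(duckdb)   # also attaches DBI, as in the reprex above
library(dplyr)
library(dbplyr)

con <- dbConnect(duckdb())

# Hypothetical stand-in for the taxi data: two pickups in different months.
trips <- data.frame(
  pickup_datetime = as.POSIXct(
    c("2019-01-15 08:00:00", "2019-03-02 17:30:00"),
    tz = "UTC"
  )
)
dbWriteTable(con, "trips", trips)

# The mutate() is translated to SQL by dbplyr and evaluated inside DuckDB;
# collect() then brings the result back as a regular tibble.
res <- tbl(con, "trips") |>
  mutate(month = lubridate::month(pickup_datetime)) |>
  arrange(pickup_datetime) |>
  collect()

dbDisconnect(con, shutdown = TRUE)
```

The collected `res` is a plain tibble, so any later steps can go back to duckplyr or dplyr as usual. Appending `show_query()` in place of `collect()` shows the SQL that dbplyr generates for the month extraction.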