tidyverse / dbplyr

Database (DBI) backend for dplyr
https://dbplyr.tidyverse.org
Other
473 stars 171 forks source link

function translations for DATEDIFF and DATEADD #1080

Open ablack3 opened 1 year ago

ablack3 commented 1 year ago

Feature Request

DATEDIFF and DATEADD are commonly used SQL functions with slightly different syntax on different database platforms. I would like to use dbplyr's translation approach to generate SQL statements DATEDIFF and DATEADD that are correct for various database platforms.

I can use sql passthrough but then the dplyr code won't be translated and work on multiple database platforms. I would like to write one dplyr expression to calculate the difference between dates (or addition of an interval to a date) that will be translated correctly on both sql server and postgres. Below is a reprex that generates correct SQL on postgres but not sql server.

library(tidyverse)
df <- tibble(date1 = as.Date("2020-02-02"), date2 = as.Date("2020-02-03"))
db <- dbplyr::lazy_frame(df, con = dbplyr::simulate_postgres())

# Date add
db %>% 
  mutate(date3 = date1 + lubridate::years(1)) %>% 
  show_query()
#> <SQL>
#> SELECT *, `date1` + CAST('1 years' AS INTERVAL) AS `date3`
#> FROM `df`

# Date diff
db %>% 
  mutate(days = date2 - date1) %>% 
  show_query()
#> <SQL>
#> SELECT *, `date2` - `date1` AS `days`
#> FROM `df`

db <- dbplyr::lazy_frame(df, con = dbplyr::simulate_mssql())

# Date add
db %>% 
  mutate(date3 = date1 + lubridate::years(1)) %>% 
  show_query()
#> <SQL>
#> SELECT *, `date1` + years(1.0) AS `date3`
#> FROM `df`

# Date diff
db %>% 
  mutate(days = date2 - date1) %>% 
  show_query()
#> <SQL>
#> SELECT *, `date2` - `date1` AS `days`
#> FROM `df`

Created on 2022-12-15 with reprex v2.0.2

mgirlich commented 1 year ago

dbplyr doesn't know the column types, so it is not possible to simply translate + and - correctly for dates. Therefore, a custom function like date_add() and date_diff() would be necessary. Unfortunately, this would probably feel a bit artificial as this isn't required in R. Also, this would be quite hard to find. So, I'm not sure it is worthing adding such a function.

@hadley What's your opinion on this?

ablack3 commented 1 year ago

What about translating the R function difftime to DATEDIFF?

Is it possible/recommended for me to extend dbplyr function translation and add date_add and date_diff R functions in my own package with SQL translations by creating a custom sql translation environment?

hadley commented 1 year ago

Yeah, I can't see an easy way for dbplyr to support this currently — we don't know the column types so we can't make +/- work and we don't have an extension mechanism so that another package could define date_add() and date_diff().

ablack3 commented 1 year ago

I'm going to reopen as we explore the possibility of implementing translations for the clock functions. I've opened an initial PR here. Thanks for your help with this.