Closed: fh-afrachioni closed this issue 1 month ago
Just to note, I also see this inconsistency on Postgres:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(dbplyr)
#>
#> Attaching package: 'dbplyr'
#> The following objects are masked from 'package:dplyr':
#>
#> ident, sql
test_data <- data.frame(
  person = 1L,
  date_1 = as.Date("1980-01-01"),
  date_2 = as.Date("2010-01-01")
)
con <- DBI::dbConnect(RPostgres::Postgres(),
                      dbname = Sys.getenv("CDM5_POSTGRESQL_DBNAME"),
                      host = Sys.getenv("CDM5_POSTGRESQL_HOST"),
                      user = Sys.getenv("CDM5_POSTGRESQL_USER"),
                      password = Sys.getenv("CDM5_POSTGRESQL_PASSWORD"))
test_data |>
  mutate(days = difftime(date_1, date_2))
#>   person     date_1     date_2        days
#> 1      1 1980-01-01 2010-01-01 -10958 days
db_test_data <- copy_to(con,
                        test_data,
                        overwrite = TRUE)
db_test_data |>
  mutate(days = difftime(date_1, date_2))
#> # Source: SQL [1 x 4]
#> # Database: postgres [ohdsi@pgsqltest.cqnqzwtn5s1q.us-east-1.rds.amazonaws.com:5432/vocabularyv5]
#> person date_1 date_2 days
#> <int> <date> <date> <int>
#> 1 1 1980-01-01 2010-01-01 10958
db_test_data |>
  mutate(days = difftime(date_1, date_2)) |>
  dplyr::show_query()
#> <SQL>
#> SELECT
#> "test_data".*,
#> (CAST("date_2" AS DATE) - CAST("date_1" AS DATE)) AS "days"
#> FROM "test_data"
Created on 2024-08-12 with reprex v2.1.0
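For reference, the sign that the local result follows can be checked in base R without any database. This is only a small sketch using the same two dates as the reprex; the variable names are mine.

```r
date_1 <- as.Date("1980-01-01")
date_2 <- as.Date("2010-01-01")

# In base R, difftime(time1, time2) is time1 - time2.
local_days <- as.numeric(difftime(date_1, date_2, units = "days"))
minus_days <- as.numeric(date_1 - date_2)

print(local_days)                 # negative, because date_1 is earlier than date_2
print(local_days == minus_days)   # difftime() agrees with the - operator

# The generated Postgres SQL above computes date_2 - date_1 instead,
# which has the same magnitude but the opposite sign.
swapped_days <- as.numeric(date_2 - date_1)
print(swapped_days == -local_days)
```

So a translation that matches R should produce date_1 - date_2 for difftime(date_1, date_2).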
On several backends, difftime(a, b) is translated to DATEDIFF(a, b), when it should be DATEDIFF(b, a). Counterintuitive but well documented.
Notably, Spark appears to be implemented correctly, as it has arguments in the opposite order. Other backends seem to make use of the - operator correctly.
Example:
leads to
And on a real live Snowflake backend,
leads to
cc @fh-mthomson
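In case it helps anyone compare backends without credentials, here is a rough sketch that renders the generated SQL with dbplyr's simulated connections. It assumes a dbplyr version that ships simulate_postgres(), simulate_snowflake() and simulate_spark_sql(). The choice of backends and the names backends and rendered are just for illustration, and the exact SQL will depend on the installed version.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)

# Simulated connections: no live database is needed to inspect translations.
backends <- list(
  postgres  = simulate_postgres(),
  snowflake = simulate_snowflake(),
  spark     = simulate_spark_sql()
)

# Render the SQL generated for difftime(date_1, date_2) on each backend.
rendered <- vapply(backends, function(con) {
  lf <- lazy_frame(
    date_1 = as.Date("1980-01-01"),
    date_2 = as.Date("2010-01-01"),
    con = con
  )
  as.character(sql_render(mutate(lf, days = difftime(date_1, date_2))))
}, character(1))

# Print each query so the argument order can be compared by eye.
for (name in names(rendered)) {
  cat("--", name, "--\n", rendered[[name]], "\n\n", sep = " ")
}
```

Whichever argument appears first (or on the left of the - operator) tells you whether the backend computes date_1 - date_2, as R does, or the reverse.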