catalamarti closed this issue 2 months ago
sparklyr is having the same issues when printing to the console. So far I've tracked it down to head() producing the table.* call. It looks like when dbplyr pulls the first row, I guess to get the column names, that call triggers either a head() call or some lower-level call in that same vein, which in turn outputs the error.
Here is the issue in sparklyr: https://github.com/sparklyr/sparklyr/issues/3429
Reprex:
library(sparklyr)
packageVersion("dbplyr")
#> [1] '2.5.0'
sc <- spark_connect("local", version = "3.3.0")
tbl_mtcars <- copy_to(sc, mtcars)
dbplyr::remote_query(head(tbl_mtcars, 1))
#> <SQL> SELECT `mtcars`.`*`
#> FROM `mtcars`
#> LIMIT 1
@edgararuiz Are you saying that Spark only supports * when it's not qualified? I.e. this would work
SELECT `*`
FROM `mtcars`
LIMIT 1
but not this
SELECT `mtcars`.`*`
FROM `mtcars`
LIMIT 1
(the relevant change would then come from https://github.com/tidyverse/dbplyr/pull/1278)
@catalamarti I think your reprex boils down to
library(DBI)
library(CDMConnector)
library(dplyr)
library(duckdb)
con <- DBI::dbConnect(duckdb(), path = ":memory:")
test_data <- data.frame(person = 1L,
date_1 = as.Date("2001-01-01"))
db_test_data <- copy_to(con, test_data, overwrite = TRUE)
db_test_data <- db_test_data %>%
dplyr::mutate(date_2 = date_1 + years(1))
db_test_data %>%
mutate(date_with_prior_history_1 = CDMConnector::dateadd("date_2", 1, interval = "year"))
but if you replace the last line with
db_test_data %>%
mutate(date_with_prior_history_1 = !!CDMConnector::dateadd("date_2", 1, interval = "year"))
it works. This is similar to what you do in your examples. Would that work for you?
@hadley this only worked in dbplyr < 2.5.0 due to qualifying the function with the package, i.e. using CDMConnector::dateadd instead of dateadd.
Hi @mgirlich, yes, your approach works for the usual use of the function. But in packages where I want to put several variables in the same mutate (to generate efficient SQL) depending on external variables, I can no longer use !!! with the function, as it breaks. Reprex:
library(DBI)
library(CDMConnector)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(duckdb)
con <- DBI::dbConnect(duckdb(), path = ":memory:")
test_data <- data.frame(person = 1L,
date_1 = as.Date("2001-01-01"))
db_test_data <- copy_to(con, test_data, overwrite = TRUE)
db_test_data <- db_test_data %>%
dplyr::mutate(date_2 = date_1 + years(1))
# This works
db_test_data %>%
mutate(date_with_prior_history_1 = !!CDMConnector::dateadd("date_2", 1, interval = "year"))
#> # Source: SQL [1 x 4]
#> # Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.3/:memory:]
#> person date_1 date_2 date_with_prior_history_1
#> <int> <date> <dttm> <dttm>
#> 1 1 2001-01-01 2002-01-01 00:00:00 2003-01-01 00:00:00
# This doesn't
x <- rlang::parse_exprs("CDMConnector::dateadd('date_2', 1, interval = 'year')") %>%
rlang::set_names(glue::glue("date_with_prior_history_1"))
db_test_data %>%
mutate(!!!x)
#> Error in `CDMConnector::dateadd()`:
#> ! No known SQL translation
# Use case
ph <- 1:3
x <- glue::glue("CDMConnector::dateadd('date_2', {ph}, interval = 'year')") %>%
rlang::parse_exprs() %>%
rlang::set_names(glue::glue("date_with_prior_history_{ph}"))
db_test_data %>%
mutate(!!!x)
#> Error in `CDMConnector::dateadd()`:
#> ! No known SQL translation
Created on 2024-03-22 with reprex v2.1.0
Sorry for not providing more context before
You can simply wrap the thing with local():
x <- rlang::parse_exprs("local(CDMConnector::dateadd('date_2', 1, interval = 'year'))") %>%
rlang::set_names(glue::glue("date_with_prior_history_1"))
db_test_data %>%
mutate(!!!x)
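The same wrapping should carry over to the multi-variable use case from the earlier reprex. A minimal sketch, assuming the ph vector and column naming used there; it only builds the expressions and has not been run against a database here:

```r
library(glue)
library(rlang)

# Offsets (in years), as in the "Use case" part of the earlier reprex
ph <- 1:3

# Build one call per offset, each wrapped in local() so that dbplyr
# evaluates it in R instead of looking for a SQL translation
calls <- glue("local(CDMConnector::dateadd('date_2', {ph}, interval = 'year'))")

# Parse the strings into expressions and name them after the new columns
x <- set_names(parse_exprs(calls), glue("date_with_prior_history_{ph}"))

# Then splice as before: db_test_data %>% mutate(!!!x)
```

Only the string template changes; the splicing with !!! stays the same as in the single-column example.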
Thank you very much @mgirlich that works for us, we will amend the packages. FYI: @edward-burn @ablack3 @ilovemane @mimiyuchenguo ...
@mgirlich - Correct, the backticks cause an issue. But at this time, I made a change in sparklyr that prevents the failure. I matched sparklyr's dbQuoteIdentifier() to what the other methods do, which is to return x as-is if x is SQL class. This is probably why other backends did not have that issue.
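To illustrate the pass-through behavior described here, a rough sketch follows. The helper quote_identifier_sketch() is hypothetical and is not sparklyr's actual method; it only assumes that DBI::SQL() marks a string as already-valid SQL and that backticks are the quoting character:

```r
library(DBI)

# Hypothetical helper, NOT sparklyr's real code: quote an identifier
# unless it has already been marked as SQL.
quote_identifier_sketch <- function(x) {
  if (methods::is(x, "SQL")) {
    return(x)  # already SQL (e.g. a qualified star): return unchanged
  }
  # Escape embedded backticks by doubling them, then wrap in backticks
  DBI::SQL(paste0("`", gsub("`", "``", x, fixed = TRUE), "`"))
}

quote_identifier_sketch("mtcars")                # plain name gets quoted
quote_identifier_sketch(DBI::SQL("`mtcars`.*"))  # SQL input is left as-is
```

Without the early return, input that is already SQL would be quoted a second time, which is the kind of failure discussed above.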
@catalamarti many apologies for this screw up — I had spotted the problems in my revdep checks but I incorrectly attributed them to a problem that I thought I had fixed in CDMConnector. I know it's miserable to receive this sort of email from CRAN and I'm sorry my mistake has caused extra work for you 😞.
(And apologies for taking so long to respond to this thread; in hindsight it was clearly a bad idea to submit dbplyr to CRAN just before I left for two weeks vacation 😬)
Thank you very much @mgirlich @hadley. From my side, we can close the issue.
hi @hadley @mgirlich
the last release of dbplyr (2.5.0) broke the following packages:
It is a difficult fix that requires coming up with a different approach and implementing it in all the packages before they are kicked off CRAN, unless dbplyr goes back to the prior behavior in time.
I will try to explain with a series of reprexes:
Created on 2024-03-20 with reprex v2.0.2
dateadd is a function that we have in CDMConnector to provide the translation for adding a number to a date in different DBMSs: https://github.com/darwin-eu/CDMConnector/blob/105b9d3df7e3fb2e66a14928e51557c22e8881ca/R/dateadd.R#L25-L65
The long-term plan is to contribute to dbplyr, as @ablack3 did here https://github.com/tidyverse/dbplyr/pull/1357, so that in the long term these utility functions are not needed. But for the moment, while it is not implemented for all DBMSs, we need a workaround to make this work.