r-dbi / RPostgres

A DBI-compliant interface to PostgreSQL
https://rpostgres.r-dbi.org
Other
329 stars 78 forks source link

Breaking change to dbQuoteLiteral with NA in 1.4.2 #393

Closed ankane closed 1 year ago

ankane commented 2 years ago

Hey @krlmlr, it looks like #370 causes issues for queries that rely on implicit conversion (https://github.com/ankane/dbx/issues/30).

library(DBI)

db <- dbConnect(RPostgres::Postgres(), dbname="rpostgres_repro")
dbExecute(db, "DROP TABLE IF EXISTS t")
#> [1] 0
dbExecute(db, "CREATE TABLE t (col jsonb)")
#> [1] 0

json <- c("{}", NA)
values_expr <- paste(paste0("(", dbQuoteLiteral(db, json), ")"), collapse=", ")
stmt <- paste("INSERT INTO t (col) VALUES", values_expr)
print(stmt)
#> [1] "INSERT INTO t (col) VALUES ('{}'), (NULL::text)"

dbExecute(db, stmt)
#> Error: Failed to prepare query: ERROR:  column "col" is of type jsonb but expression is of type text
#> LINE 1: INSERT INTO t (col) VALUES ('{}'), (NULL::text)
#>                                             ^
#> HINT:  You will need to rewrite or cast the expression.

The script succeeds with 1.4.1. Is it possible to revert the change?

krlmlr commented 2 years ago

Thanks, confirmed. Would it help if we emitted NULL instead of NULL::text only for string values?

ankane commented 2 years ago

Yeah, I think that should solve it.

mgirlich commented 2 years ago

Imho in this case the correct thing to do would be to have the json variable to actually be of class json and provide a dbQuoteLiteral() method for json.

Alternatively (or maybe have that anyway) add an argument implicit to dbQuoteLiteral()?

tiagocabaco commented 1 year ago

hey, are there any new updates regarding this issue? thank you :)

tiagocabaco commented 1 year ago

Would it make sense to extend on https://github.com/r-dbi/RPostgres/pull/370/ and reverse the change applied here https://github.com/r-dbi/RPostgres/pull/370/commits/b6d97c1f26f293f715c49ec895b23d7c8abc5c64#diff-49690042645df85dc9ae7e18aec05c8dd787da2c81d51128d9a2683dacc2dde5R30?

Or would you prefer to extent on the type handling by adding one for json fields?

ankane commented 1 year ago

fwiw, this affects all types (jsonb was just an example).

Kjir commented 1 year ago

This also affects me. The following code won't call the correct stored procedure because of the type cast on the NULL:

  if(is.null(valid_on)) {
    valid_on <- NA
  }

res <- dbSendQuery(con, sprintf("select * from %sts_read_raw(%s, %s)",
  dbQuoteIdentifier(con, Id(schema = schema)),
  dbQuoteLiteral(con, valid_on),
  dbQuoteLiteral(con, respect_release_date)))

I get the error:

Error: Failed to prepare query: ERROR:  function timeseries.ts_read_raw(boolean, boolean) does not exist
LINE 1: select * from "timeseries".ts_read_raw(NULL::bool, FALSE)

Downgrading to 1.4.1 fixes the issue.

Might I suggest to not cast NULL to bool? If assigning a literal NA in code it will always default to a logical, even though the variable might be intended to hold a different type of data. Programmers would have to very explicitly set the type, which is not common in R, like so:

  if(is.null(valid_on)) {
    valid_on <- as.Date(NA)
  }

Expecting this to be the case for all libraries and existing R code is unrealistic and what happens instead is that there are unexpected errors in code that used to work perfectly before.

So my proposal is that if it is not possible to define a more restricted type (with all the checks already in place), the fallback is to translate NA into NULL without cast

mbannert commented 1 year ago

@krlmlr I can confirm this is interesting, what do you think about @Kjir suggestion?

dpprdan commented 1 year ago

IIUC Postgres (sometimes) automatically casts untyped NULLs (i.e. those of unknown type) to text, e.g. in a VALUE statement. This is leads to problems with non-text-ish types, see this answer on SO.

OTOH explicitly casting NULLs of text-ish types (e.g. json or enum) to text is also problematic, as we can see here.

(I originally ran into this dbQuoteLiteral() issue while trying to dbx::dbxUpsert() an enum with NULL values.)

Another example where Postgres seems to require an explicitly cast NULL on a non-text-ish type, namely a function that expects a date (or timestamp or interval):

library(RPostgres)
con <- postgresDefault()

dbGetQuery(con, "SELECT date_part('year', NULL);")
#> Error: Failed to prepare query: ERROR:  function date_part(unknown, unknown) is not unique
#> LINE 1: SELECT date_part('year', NULL);
#>                ^
#> HINT:  Could not choose a best candidate function. You might need to add explicit type casts.
dbGetQuery(con, "SELECT date_part('year', NULL::date );")
#>   date_part
#> 1        NA

While for a function that requires a json this seems to be the other way around.

dbGetQuery(con, "SELECT jsonb_pretty(NULL);")
#>   jsonb_pretty
#> 1         <NA>
dbGetQuery(con, "SELECT jsonb_pretty(NULL::text);")
#> Error: Failed to prepare query: ERROR:  function jsonb_pretty(text) does not exist
#> LINE 1: SELECT jsonb_pretty(NULL::text);
#>                ^
#> HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

As an aside: It gets even more 😡 with enum support functions.

Notice that except for the two-argument form of enum_range, these functions disregard the specific value passed to them; they care only about its declared data type. Either null or a specific value of the type can be passed, with the same result. It is more common to apply these functions to a table column or function argument than to a hardwired type name as used in the examples. PostgreSQL: Documentation: 15: 9.10.Β Enum Support Functions

Long story short: @krlmlr’s proposal, i.e.Β not type-casting NULLs only for string values, seems the way to go. (Though again with the important AFAIU caveat, i.e.Β I may be missing an important nuance.)

Session info ``` r dbGetQuery(con, "SELECT version(); ") #> version #> 1 PostgreSQL 15.1 (Debian 15.1-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit ``` ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.2.2 (2022-10-31 ucrt) #> os Windows 10 x64 (build 19044) #> system x86_64, mingw32 #> ui RTerm #> language en #> collate German_Germany.utf8 #> ctype German_Germany.utf8 #> tz Europe/Berlin #> date 2023-02-28 #> pandoc 2.19.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> bit 4.0.5 2022-11-15 [1] CRAN (R 4.2.2) #> bit64 4.0.5 2020-08-30 [1] CRAN (R 4.2.0) #> blob 1.2.3 2022-04-10 [1] CRAN (R 4.2.0) #> cli 3.6.0 2023-01-09 [1] CRAN (R 4.2.2) #> DBI 1.1.3 2022-06-18 [1] CRAN (R 4.2.0) #> digest 0.6.31 2022-12-11 [1] CRAN (R 4.2.2) #> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0) #> evaluate 0.20 2023-01-17 [1] CRAN (R 4.2.2) #> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0) #> fs 1.6.1 2023-02-06 [1] CRAN (R 4.2.2) #> generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.1) #> glue 1.6.2.9000 2023-01-16 [1] Github (tidyverse/glue@5a16502) #> hms 1.1.2 2022-08-19 [1] CRAN (R 4.2.1) #> htmltools 0.5.4 2022-12-07 [1] CRAN (R 4.2.2) #> knitr 1.42 2023-01-25 [1] CRAN (R 4.2.2) #> lifecycle 1.0.3 2022-10-07 [1] RSPM #> lubridate 1.9.2 2023-02-10 [1] CRAN (R 4.2.2) #> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0) #> purrr 1.0.1 2023-01-10 [1] CRAN (R 4.2.2) #> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.2.1) #> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.2.0) #> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.2.0) #> R.utils 2.12.2 2022-11-11 [1] CRAN (R 4.2.2) #> Rcpp 1.0.10 2023-01-22 [1] CRAN (R 4.2.2) #> reprex 2.0.2 2022-08-17 [1] CRAN (R 4.2.1) #> rlang 1.0.6 2022-09-24 [1] CRAN (R 4.2.1) #> rmarkdown 2.20 2023-01-19 [1] CRAN (R 4.2.2) #> RPostgres * 1.4.5 2023-01-20 [1] CRAN (R 4.2.2) #> rstudioapi 0.14 2022-08-22 [1] CRAN (R 4.2.1) #> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0) #> styler 1.9.0 2023-01-15 [1] CRAN (R 4.2.2) #> timechange 0.2.0 2023-01-11 [1] CRAN (R 4.2.2) #> vctrs 0.5.2 2023-01-23 [1] CRAN (R 4.2.2) #> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0) #> xfun 0.37 2023-01-31 [1] CRAN (R 4.2.2) #> yaml 2.3.7 2023-01-23 [1] CRAN (R 4.2.2) #> #> [1] C:/Users/Daniel/AppData/Local/R/win-library/4.2 #> [2] C:/Program Files/R/R-4.2.2/library #> #> ────────────────────────────────────────────────────────────────────────────── ```
krlmlr commented 1 year ago

With #425, the motivating example works.

ankane commented 1 year ago

Thanks @krlmlr