etiennebr closed this issue 6 years ago.
Working with schemas is tricky in DBI in general, but thankfully work is underway on that front (https://github.com/r-dbi/DBI/issues/24). The easiest way to work around this upstream limitation is to define the schema server-side.
Maybe your point is that this workaround shouldn't be necessary, but I think (hope) some of the finicky schema behavior will be taken care of once DBI takes schemas into account.
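A minimal sketch of that server-side workaround (connection details elided, as elsewhere in this thread): setting `search_path` makes unqualified table names resolve to the target schema, so `overwrite` and `append` behave normally.

```r
library(DBI)
library(odbc)

con <- dbConnect(odbc::odbc(), ...)  # placeholder connection, as in the examples below

# Point unqualified table names at the target schema server-side,
# then let dbWriteTable() use a plain (unqualified) name.
dbExecute(con, "SET search_path = datawarehouse")
dbWriteTable(con, "a", iris, overwrite = TRUE)  # creates/overwrites datawarehouse.a
```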
library(odbc)
library(DBI)
con <- dbConnect(odbc::odbc(), ...)
dbWriteTable(con, "a", iris) ## ok
dbWriteTable(con, "a", iris) ## err
#> Error: Table a exists in database, and both overwrite and append are FALSE
dbWriteTable(con, "a", iris, overwrite = TRUE) ## ok
dbWriteTable(con, DBI::SQL("public.a"), iris, overwrite = TRUE) ## err
#> Note: method with signature 'DBIConnection#SQL' chosen for function 'dbQuoteIdentifier',
#> target signature 'PostgreSQL#SQL'.
#> "OdbcConnection#character" would also be valid
#> Error: <SQL> 'CREATE TABLE public.a (
#> "Sepal.Length" DOUBLE PRECISION,
#> "Sepal.Width" DOUBLE PRECISION,
#> "Petal.Length" DOUBLE PRECISION,
#> "Petal.Width" DOUBLE PRECISION,
#> "Species" TEXT
#> )
#> '
#> nanodbc/nanodbc.cpp:1587: 42P07: [RStudio][PostgreSQL] (30) Error occurred while trying to execute a query: [SQLState 42P07] ERROR: relation "a" already exists
#>
## working in a different schema
dbWriteTable(con, DBI::SQL("datawarehouse.a"), iris) ## ok
dbWriteTable(con, DBI::SQL("datawarehouse.a"), iris, overwrite = TRUE) ## err
#> Error: <SQL> 'CREATE TABLE datawarehouse.a (
#> "Sepal.Length" DOUBLE PRECISION,
#> "Sepal.Width" DOUBLE PRECISION,
#> "Petal.Length" DOUBLE PRECISION,
#> "Petal.Width" DOUBLE PRECISION,
#> "Species" TEXT
#> )
#> '
#> nanodbc/nanodbc.cpp:1587: 42P07: [RStudio][PostgreSQL] (30) Error occurred while trying to execute a query: [SQLState 42P07] ERROR: relation "a" already exists
#>
dbExecute(con, "SET search_path = datawarehouse")
#> [1] 0
dbWriteTable(con, "a", iris, overwrite = TRUE) ## ok
Wondering if maybe the dispatch of dbWriteTable on a SQL object doesn't include a DROP TABLE if overwrite = TRUE...
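Until that dispatch is sorted out, one hedged workaround is to issue the DROP yourself; `DROP TABLE IF EXISTS` is standard PostgreSQL, so the subsequent CREATE from dbWriteTable() cannot collide:

```r
library(DBI)
library(odbc)

con <- dbConnect(odbc::odbc(), ...)  # placeholder connection

# Drop the schema-qualified table explicitly, since overwrite = TRUE
# never reaches it (dbExistsTable() reports FALSE for the qualified name).
dbExecute(con, "DROP TABLE IF EXISTS public.a")
dbWriteTable(con, DBI::SQL("public.a"), iris)
```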
The only place this really gets painful is cross-schema joins... but dbplyr comes to the rescue there!
library(dbplyr)
library(dplyr)
library(odbc)
library(DBI)
con <- dbConnect(odbc::odbc(), ...)
tbl_a <- tbl(con, in_schema("public", "a"))
tbl_b <- tbl(con, in_schema("datawarehouse", "a"))
tbl_a %>% left_join(tbl_b, by = c("Species")) %>% sql_render()
#> <SQL> SELECT "TBL_LEFT"."Sepal.Length" AS "Sepal.Length.x", "TBL_LEFT"."Sepal.Width" AS "Sepal.Width.x", "TBL_LEFT"."Petal.Length" AS "Petal.Length.x", "TBL_LEFT"."Petal.Width" AS "Petal.Width.x", "TBL_LEFT"."Species" AS "Species", "TBL_RIGHT"."Sepal.Length" AS "Sepal.Length.y", "TBL_RIGHT"."Sepal.Width" AS "Sepal.Width.y", "TBL_RIGHT"."Petal.Length" AS "Petal.Length.y", "TBL_RIGHT"."Petal.Width" AS "Petal.Width.y"
#> FROM public.a AS "TBL_LEFT"
#> LEFT JOIN datawarehouse.a AS "TBL_RIGHT"
#> ON ("TBL_LEFT"."Species" = "TBL_RIGHT"."Species")
So extending what I said, I think what you have uncovered @etiennebr is an inconsistency in the way that SQL interpretation happens wrt schemas. Again, something that will hopefully be taken into account and tested for as DBI adds schema support.
Also, I reiterate my suggestion to do the following until that is the case: use dbplyr with in_schema() for schema-qualified references, or fall back to dbGetQuery or dbSendQuery with explicit SQL.
Quick semantics: dbWriteTable uses dbExistsTable internally as a test when overwrite = TRUE to see whether it should drop the existing table. The problem you are running into comes from dbExistsTable and dbWriteTable understanding schemas differently.
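A hypothetical sketch (not odbc's actual source) of that overwrite logic inside dbWriteTable(), just to show where the mismatch bites:

```r
# Pseudocode-style sketch; `name` is the table identifier the user passed,
# `value` is the data frame to write.
if (overwrite && dbExistsTable(con, name)) {
  # With name = SQL("public.a"), dbExistsTable() returns FALSE,
  # so this DROP never runs...
  dbRemoveTable(con, name)
}
# ...and the CREATE then fails with "relation already exists",
# because the server resolves "public.a" to the same relation as "a".
dbExecute(con, sqlCreateTable(con, name, value))
```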
Check out this sequence:
library(odbc)
library(DBI)
con <- dbConnect(odbc::odbc(), ...)
# I got this problematic sequence from the original example above
dbWriteTable(con, DBI::SQL("a"), data.frame(a = 1, b = 2))
dbWriteTable(con, DBI::SQL("public.a"), data.frame(a = 1, b = 2))
#> Error: <SQL> 'CREATE TABLE public.a (
#> "a" DOUBLE PRECISION,
#> "b" DOUBLE PRECISION
#> )
#> '
#> nanodbc/nanodbc.cpp:1587: 42P07: [RStudio][PostgreSQL] (30) Error occurred while trying to execute a query: [SQLState 42P07] ERROR: relation "a" already exists
#>
dbWriteTable(con, DBI::SQL("public.a"), data.frame(a = 1, b = 2), overwrite = TRUE)
#> Error: <SQL> 'CREATE TABLE public.a (
#> "a" DOUBLE PRECISION,
#> "b" DOUBLE PRECISION
#> )
#> '
#> nanodbc/nanodbc.cpp:1587: 42P07: [RStudio][PostgreSQL] (30) Error occurred while trying to execute a query: [SQLState 42P07] ERROR: relation "a" already exists
#>
# this is why that happens
dbExistsTable(con, DBI::SQL("public.a"))
#> [1] FALSE
dbWriteTable(con, "public.a", data.frame(a = 1, b = 2))
dbExistsTable(con, DBI::SQL("public.a"))
#> [1] TRUE
And our schema looks like... [screenshot omitted]
Essentially, when you say overwrite = TRUE, dbExistsTable comes along and says "no worries - this table is not defined yet." So the DROP TABLE never happens. Then the CREATE TABLE fails because the table already exists. They're referencing different tables!
Oops. This is a duplicate of #91. Same issue, I think.
Thanks @colearendt, you're absolutely right, and your workaround using dbplyr is ingenious! I thought the issue was specific to odbc's overwrite and append options, but indeed it generalizes to DBI, and I know there are plans to fix schema management, hopefully soon.
Description
It seems that specifying a schema using DBI::SQL() prevents the use of overwrite and append. Here's an example using the public schema, but obviously it is when using other schemas that this is hard to work around.
Database
PostgreSQL 9.3.20 64-bit
Example