tomoakin / RPostgreSQL

Automatically exported from code.google.com/p/rpostgresql
64 stars 20 forks source link

dplyr::copy_to(overwrite = TRUE) causes PostgreSQL to issue a spurious warning #94

Closed JohnMount closed 6 years ago

JohnMount commented 6 years ago

This is possibly a dplyr issue (also entered it as dplyr issue 3197), but if you are willing to research the following it could help some RPostgreSQL users.

dplyr::copy_to(overwrite = TRUE) causes PostgreSQL to issue a spurious warning. When deep in larger scripts are procedures this has been causing users to question the correctness of results as they do not have enough context to know which "tables did not exist" and what effect that has had on a larger process. In context it looks like an unconditional drop was issued, perhaps there is a conditional drop that will not trigger this issue?

Below is a reprex showing the effect (using the dev version of reprex to capture the message).

reprex::reprex_info()
#> Created by the reprex package v0.1.1.9000 on 2017-11-08

# https://github.com/tidyverse/reprex/issues/90
# devtools::install_github("tidyverse/reprex")
# reprex::reprex(std_out_err = TRUE)
suppressPackageStartupMessages(library(dplyr))
packageVersion("dplyr")
#> [1] '0.7.4'
library("RPostgreSQL")
#> Loading required package: DBI
packageVersion("RPostgreSQL")
#> [1] '0.6.2'
my_db <- DBI::dbConnect(dbDriver("PostgreSQL"), 
                        host = 'localhost',
                        port = 5432,
                        user = 'postgres',
                        password = 'pg')
d <- copy_to(my_db, data.frame(x = 1), 
             name = 'd',
             overwrite = TRUE, temporary=TRUE)

standard output and standard error

#> NOTICE:  table "d" does not exist, skipping
tomoakin commented 6 years ago

DROP TABLE IF EXISTS is expected to issue a notice if the table does not exist. https://www.postgresql.org/docs/9.6/static/sql-droptable.html

You can check the existence of the table by dbExistsTable() before issuing DROP TABLE sql. This is done in dbWriteTable(...,overwrite = TRUE).

With the call sequence

https://github.com/tidyverse/dbplyr/blob/master/R/db-compute.R

if (overwrite) {
  db_drop_table(con, table, force = TRUE)
}

https://github.com/tidyverse/dbplyr/blob/master/R/dbi-s3.r

db_drop_table.DBIConnection <- function(con, table, force = FALSE, ...) {
 sql <- build_sql(
   "DROP TABLE ", if (force) sql("IF EXISTS "), as.sql(table),
    con = con
  )
  dbExecute(con, sql)
}

The NOTICE is just as spec.

I am not sure why getting a message "NOTICE: table "d" does not exist, skipping" is considered spurious when you call dplyr::copy_to(name = 'd', overwrite = TRUE) when the table 'd' does not exist. This just depends on the design of copy_to.

The existence of table may be checked at db_drop_table(force = TRUE). Another option is before calling db_drop_table(). The decision may depend on other databases that is handled by those functions.