r-dbi / dbi3

DBI revisited
https://r-dbi.github.io/dbi3
37 stars 2 forks source link

Streaming multiple statements in a single dbExecute/dbQuery/etc. call #56

Open mmuurr opened 2 years ago

mmuurr commented 2 years ago

This was mentioned a few years back in r-dbi/RPostgres#152 and I simply failed to actually start the conversation over here. It'd be very useful to be able to place multiple SQL statements in a single file then send that to the DB backend. I've had cryptic failures with both RMariaDB and RPostgres when wanting to do something akin to:

DBI::dbExecute(conn, readr::read_file("my-sql-script.sql"))

... and perhaps mixed-in with other DBI utilities and single-statement constructs, like:

DBI::dbWithTransaction(conn, {
  DBI::dbExecute(conn, readr::read_file("my-sql-script.sql"))  ## <-- as an example, this multi-statement script may create _my_temp_table
  DBI::dbGetQuery(conn, "select * from _my_temp_table")  ## <-- single statement
  ## some other application code
})

As a further-motivating case, consider where "my-sql-script.sql" is a {glue} template, so:

DBI::dbExecute(conn, glue::glue_sql(readr::read_file("my-sql-script.sql"), .con = conn))

The {glue} example helps address the 'what to do for parameterized statements' question: in essence, nothing ... in fact if there was something like dbExecuteBatch() that simply didn't permit driver-managed parameterization, I think that'd be great. Any interpolation needed can be handled by {glue} or other string-based methods (with all the associated buyer-beware injection-attack caveats).

With the current single-statement implementations, one can rip apart a script into multiple statements, but many SQL scripts are quite long (at least in my case). Ripping apart the script into lots of separate files thus isn't a great option, nor is littering the R code with a lot of SQL string literals, especially when the SQL file is useful on its own (i.e. as a standalone script to be used by other programs or users).

One could try to parse the file and separate out the individual statements, then iterate over them, like:

readr::read_file("my-sql-script.sql") %>%
  sqlparser::parse_into_separate_statements() %>%  ## <-- an imaginary package & function
  purrr::walk(function(x) DBI::dbExecute(conn, x))

... but that requires an actual SQL-parsing step, which seems best left to the SQL driver/engine itself.

As noted in r-dbi/RPostgres#152, PostgreSQL supports this via the use of PQexec() (vs PQprepare() for single statements).

krlmlr commented 2 years ago

Thanks for raising this. We should consider offering a way to run multi-statement nonparameterized scripts.