r-dbi / dbi3

DBI revisited
https://r-dbi.github.io/dbi3
37 stars 2 forks source link

`dbIsValid()` fails in new sessions #14

Open mllg opened 4 years ago

mllg commented 4 years ago

If dbIsValid() is called in a fresh R session, the required packages are not loaded correctly.

path_db = tempfile()
con = DBI::dbConnect(RSQLite::SQLite(), path_db)

# break the connection/external pointers by saving
# to file system and restoring again
path = tempfile()
saveRDS(con, file = path)
con = readRDS(path)

# detection works in current session
DBI::dbIsValid(con)

# detection fails in remote session, RSQLite is not loaded
callr::r(DBI::dbIsValid, list(con = con))

# RSQLite is loaded on print
callr::r(function(con) { print(con); DBI::dbIsValid(con) }, list(con = con))

Error message:

 unable to find an inherited method for function ‘dbIsValid’ for signature
 ‘"missing"’>
krlmlr commented 4 years ago

I think we need callr::r(function() ...):

path_db = tempfile()
con = DBI::dbConnect(RSQLite::SQLite(), path_db)

# break the connection/external pointers by saving
# to file system and restoring again
path = tempfile()
saveRDS(con, file = path)
con = readRDS(path)

# detection works in current session
DBI::dbIsValid(con)
#> [1] FALSE

# detection fails in remote session, RSQLite is not loaded
callr::r(function(con) DBI::dbIsValid(con), list(con = con))
#> Error: callr subprocess failed: unable to find an inherited method for function ‘dbIsValid’ for signature ‘"SQLiteConnection"’

# RSQLite is loaded on print
callr::r(function(con) { print(con); DBI::dbIsValid(con) }, list(con = con))
#> [1] FALSE

Created on 2020-04-14 by the reprex package (v0.3.0)

I'm not too familiar with S4. What do we need to do to fix this?

krlmlr commented 3 years ago

Oh, I misread. I don't think you can send con objects to other R sessions, the involved external pointers won't survive serialization.

mllg commented 3 years ago

Oh, I misread. I don't think you can send con objects to other R sessions, the involved external pointers won't survive serialization.

Sure, but shouldn't dbIsValid() return FALSE if the external pointer is NULL (R_ExternalPtrAddr(s) == NULL)?

krlmlr commented 3 years ago

Yeah, I would have expected that deserialization of con loads RSQLite, this doesn't seem to happen. This feels more like a callr issue though.

life-droid commented 3 years ago

I have come across this issue too (I run a check function prior to executing a SQL call), and I am wondering how you imagined it could be solved, I agree that it should return false in this instance.

krlmlr commented 3 years ago

@life-droid Are you sure it's the same issue? Are you also running callr?

mllg commented 3 years ago

@life-droid Are you sure it's the same issue? Are you also running callr?

Note that you have the same problem during parallelization, e.g. with socket clusters.

krlmlr commented 3 years ago

So, every time a connection object is passed across process boundaries? Can we reprex this without using callr?

life-droid commented 3 years ago

@life-droid Are you sure it's the same issue? Are you also running callr?

Note that you have the same problem during parallelization, e.g. with socket clusters.

Sorry guys, I think my novice understanding of your issue is causing some confusion and I may be muddying the waters.

My issue is that if I had a connection to a DB, then reopen a project (or in the case I had at the time, rebuild a package) the object is still present with type "S4" but calling dbIsValid(dbo) returns the error message mentioned rather than FALSE.

unable to find an inherited method for function ‘dbIsValid’ for signature ‘"missing"'

life-droid commented 3 years ago

@life-droid Are you sure it's the same issue? Are you also running callr?

Not sure, it was occuring after a package build, the connection to my dbo was broken and I want to add some error proofing so I use dbIsValid(dbo) to determine if I need to run my function to connect to the dbo.

krlmlr commented 3 years ago

Does "reopen a project" mean loading a saved session state? This seems equivalent to serializing. It should work if you load the RSQLite package first, I wonder why this is necessary.

mllg commented 3 years ago

So, every time a connection object is passed across process boundaries? Can we reprex this without using callr?

path_db = tempfile()
con = DBI::dbConnect(RSQLite::SQLite(), path_db)
cl = parallel::makePSOCKcluster("localhost")
parallel::clusterApply(cl, con, DBI::dbIsValid)
krlmlr commented 3 years ago
# library(RSQLite)
#
# con <- dbConnect(RSQLite::SQLite())
# string <- rawToChar(serialize(con, NULL, ascii = TRUE))
# dput(string)

string <- "A\n3\n262147\n197888\n5\nUTF-8\n66329\n1026\n1\n262153\n3\nptr\n22\n254\n254\n1026\n1\n262153\n6\ndbname\n16\n1\n262153\n0\n\n1026\n1\n262153\n19\nloadable.extensions\n10\n1\n1\n1026\n1\n262153\n5\nflags\n13\n1\n70\n1026\n1\n262153\n3\nvfs\n16\n1\n262153\n0\n\n1026\n1\n262153\n3\nref\n4\n0\n242\n254\n19\n29\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n1026\n1\n262153\n6\nresult\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n254\n1026\n1\n262153\n6\nbigint\n16\n1\n262153\n9\ninteger64\n1026\n1\n262153\n5\nclass\n528\n1\n262153\n16\nSQLiteConnection\n1026\n1\n262153\n7\npackage\n16\n1\n262153\n7\nRSQLite\n254\n254\n"
con <- unserialize(charToRaw(string))
DBI::dbIsValid(con)
#> Loading required package: RSQLite
#> Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'dbIsValid' for signature '"SQLiteConnection"'

requireNamespace("RSQLite")
DBI::dbIsValid(con)
#> [1] FALSE

Created on 2021-01-19 by the reprex package (v0.3.0)

krlmlr commented 3 years ago

So, the workaround is to load the RSQLite package before attempting to call dbIsValid(). Let's see if we can do more about this problem; it might be too much of an edge case though.

life-droid commented 3 years ago

Does "reopen a project" mean loading a saved session state? This seems equivalent to serializing. It should work if you load the RSQLite package first, I wonder why this is necessary.

Correct, if I have a project with a functioning connection when I check if it is valid I get the expected result:

> DBI::dbIsValid(m1)
[1] TRUE

Closing the project then reopening it (saving the workspace on close), the connection object is still visible but has a null pointer. When I run DBI::dbIsValid(m1) in this state I get the following: null pointer

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘dbIsValid’ for signature ‘"Microsoft SQL Server"’
krlmlr commented 2 years ago

Let's consider serializable connections in a rewrite. (The connection object should know how to reconnect when the external pointer is missing.)