cboettig / birddb

Import and query all of eBird locally
MIT License

Cache connection so checklist and observation data end up in same db #7

Closed mstrimas closed 3 years ago

mstrimas commented 3 years ago

I managed to borrow a Windows machine to look into that check error. It seems that on Windows you can't create both the checklist and observation views in the same folder on disk, so I've just modified the test to store the views in memory. I also found that explicitly disconnecting from the first database with DBI::dbDisconnect(con, shutdown = TRUE) before connecting to the second works too.

Obviously this isn't an ideal long-term solution, more of a stopgap measure. I'm not sure what the right approach is here though: always store views in memory, or create a new subdir of BIRDDB_DUCKDB for each new connection?
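
For reference, the explicit-disconnect workaround could look roughly like the sketch below. The folder and file names are made up for illustration and are not the paths birddb actually uses; the only call taken from the discussion is DBI::dbDisconnect(con, shutdown = TRUE).

```r
library(DBI)

# Hypothetical layout: two DuckDB files in one temporary folder.
db_dir <- file.path(tempdir(), "birddb-example")
dir.create(db_dir, showWarnings = FALSE)

# First connection, e.g. the one that would hold the checklist views.
con1 <- dbConnect(duckdb::duckdb(),
                  dbdir = file.path(db_dir, "checklists.duckdb"))

# Shut the first database down completely before opening the second one.
dbDisconnect(con1, shutdown = TRUE)

# Second connection, e.g. the one that would hold the observation views.
con2 <- dbConnect(duckdb::duckdb(),
                  dbdir = file.path(db_dir, "observations.duckdb"))
dbDisconnect(con2, shutdown = TRUE)
```

This only sidesteps the check error; the two datasets still live in separate databases.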

cboettig commented 3 years ago

Oh, good catch. I hadn't paid enough attention here and didn't realize we created separate connections. We don't want to do that; it will make life hard down the road (i.e. if you want to do relational db operations, both tables need to be on the same connection).
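
To illustrate why a shared connection matters, here is a toy sketch of a join across two tables registered on one connection. The table contents and column names are invented for the example and are not the real eBird schema.

```r
library(DBI)

# One in-memory DuckDB connection shared by both tables.
con <- dbConnect(duckdb::duckdb(), dbdir = ":memory:")

# Invented toy data standing in for the checklist and observation tables.
checklists <- data.frame(
  checklist_id = c("S1", "S2"),
  observer = c("obs_a", "obs_b")
)
observations <- data.frame(
  checklist_id = c("S1", "S1", "S2"),
  species = c("sp_x", "sp_y", "sp_x"),
  n_birds = c(2L, 1L, 5L)
)

dbWriteTable(con, "checklists", checklists)
dbWriteTable(con, "observations", observations)

# The join works because both tables are visible to the same connection.
joined <- dbGetQuery(con, "
  SELECT o.species, o.n_birds, c.observer
  FROM observations AS o
  JOIN checklists AS c ON o.checklist_id = c.checklist_id
")

dbDisconnect(con, shutdown = TRUE)
```

With two separate connections, the query above would have no way to see both tables at once.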

I was lazy in writing ebird_conn(). Generally I like connection functions to create the connection if it doesn't already exist and cache it in the package env, or, if it does exist, to load the existing connection (there's an example here: https://github.com/ropensci/arkdb/blob/master/R/local_db.R#L84). We can probably port that code over so that consecutive calls to ebird_conn() load the cached connection.
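
A minimal sketch of that create-or-reuse pattern is below. The names birddb_cache and cached_conn() are hypothetical, and this is not the arkdb code or the final ebird_conn(); it only shows the general idea of keeping the connection in an environment and handing it back on later calls.

```r
library(DBI)

# Hypothetical environment acting as the package-level cache.
birddb_cache <- new.env(parent = emptyenv())

# Illustrative stand-in for ebird_conn(): reuse the cached connection if
# it exists and is still valid, otherwise create one and cache it.
cached_conn <- function(dbdir = ":memory:") {
  if (exists("con", envir = birddb_cache, inherits = FALSE)) {
    con <- get("con", envir = birddb_cache, inherits = FALSE)
    if (dbIsValid(con)) {
      return(con)
    }
  }
  con <- dbConnect(duckdb::duckdb(), dbdir = dbdir)
  assign("con", con, envir = birddb_cache)
  con
}

# Consecutive calls hand back the same connection object.
con_a <- cached_conn()
con_b <- cached_conn()
same_connection <- identical(con_a, con_b)

dbDisconnect(con_a, shutdown = TRUE)
```

Checking dbIsValid() before reuse means a connection that was closed elsewhere gets replaced rather than returned in a broken state.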

mstrimas commented 3 years ago

Ah, of course, that makes a lot of sense. I'll try to implement this over the next few days. Thanks!

cboettig commented 3 years ago

Okay cool, keep me posted.

mstrimas commented 3 years ago

@cboettig this should now properly cache the connection and load the two datasets into the same database.