Closed kaneplusplus closed 2 years ago
I updated the codebase to only pass around readonly connections. The only exception is the .volatiles$memory
connection that is used a workspace for the query function and needs to remain open. Everything there is temporary, though. As a future reminder to myself, some dbConnect
signatures take a read_only
argument, but for duckdb this needs to be done at the driver level.
One remaining issue, though, is that in your example you are passing the connection directly to ctgov_create_data
. This will produce once of the warnings about the connection not begin closed when it's garbage collected (since you have no reference to the connection, there is no way to close it). There are a few options here:
ctgov_create_duckdb
could just create the database, but not return a connection. Then, we just pass the dbdir
location as a string to ctgov_create_data
. The latter can close the connection once it is finished. Since we are closing and reopening the connection to make it read-only, there is no efficiency lost here.ctgov_create_duckdb
in an environment, we can call reg.finalizer
on it to cleanup the connection.ctgov_create_data
can close whatever connection is given to it.Preferences? None are particularly difficult to code.
I tend to write code like the following, which writes the raw database to duckdb and then writes the derived database.
Does it make sense to create separate "write connections" for
ctgov_create_duckdb()
andctgov_create_data()
, which are explicitly closed after the write operation is complete? Other connections could then use readonly connections.I'm mostly asking because I know that if all connections to a database are read only. Read-locks run with almost no overhead (unless there a write is happening). Also, I'm not sure it's a good idea for users to be able to fish through
.volatiles
, and use the connection to make changes to the database.