Open shcheklein opened 2 months ago
We should check why read-only iterator locks the data for other data chains (it fails on creating tables for UDF for example, or removing tables on cleanup - so, on updating system tables on SQLite). Can we make these read-only stuff a non-blocker in the first place
It doesn't seem like any SQLite connections are opened in read-only mode. They all take both the read and write locks.
I had a look at introducing a read
method in the SQLiteDatabaseEngine
that would look something like this:
@retry_sqlite_locks
def read(self, query):
with sqlite3.connect(
"file::memory:?mode=ro&nolock=1&cache=shared",
uri=True,
detect_types=DETECT_TYPES,
) as connection:
return connection.execute(*self.compile_to_args(query))
Unfortunately, that doesn't seem to work (at least running as a test), perhaps because you can't open the in-memory database in read-only mode.
We need to make it a context manager (so that it close itself and release database underneath)
Doesn't this go against the decision that we made earlier to avoid using a session as a context manager?
Doesn't this go against the decision that we made earlier to avoid using a session as a context manager?
it's not about session, it's only about this Iterator returned by collect - it should be closable if we can't make DB work in a mode where there is no lock.
Action Items
Description
Related to https://github.com/iterative/datachain-examples/issues/15
Simple way to reproduce:
Causes something like (when you run it within a test suite at least, since it is causing DB cleanup, but the same can be imitated by let's say making the second part a bit more complicated - with UDF):
It's caused (I think) by us using:
It is an open connection / cursor staying underneath. It affects probably also other DataChains since we are reusing the same DB.
Workaround
Obvious workaround - use
limit
(+offset
if needed) to make sure that it's a single (or whatever number of elements is needed) and / orlist(...)
or iterate through them.