Open MarinPostma opened 10 months ago
Ah, I was sure we already implement a soft limit for transactions, to let them last longer if there's no contention. So that's something we should definitely add.
And as for checkpoints, we can make them play the same stealing game - transactions only have a soft timeout, but once a checkpoint is about to be triggered, it tries to steal the lock, and potentially roll back a transaction that already passed the soft timeout.
I think this is a good proposal and definitely a step into the right direction.
One caveat is that closing the sqlite connection after every request will increase the latency of the next request on the same connection. This can be solved though by not releasing the connection eagerly and only evict it from connections cache when we're short on resources.
I think the cost of re-opening a connection is completely absorbed by network latencies, but we'd need to measure that. There are other levers to improve connection creation latency on the long run as well
Issue with lazily closing is that if we return a baton, then it is assumed that it can be used withing the limits. Long lasting connections should be the exception, to reduces contention on the concurrency semaphore
Transactions are given a (configurable, default to 5s) timeslot to complete
Apologies for the sidetrack but how can users configure this timeout duration on turso or in libsql? We currently have a use-case where we're trying to write large amounts of data in a single transaction and it's constantly timing out.
hey @billythedummy, currently this option is not exposed in turso. But there is a workaround. Say you have a main table CREATE TABLE emails (addr)
to which you want to bulk import transactionally. What you can do is create a CREATE TABLE emails_tmp (addr)
, to which you import in batch. Once all the data is loaded in the temp table, you can do INSERT INTO emails SELECT * FROM emails_tmp
. That will copy all the data over to the main table, in a single transaction.
I propose changing how the hrana protocol handles interactive connections.
How it currently works
Hrana-over-HTTP has become the de facto protocol to interact with sqld. When a request is received by hrana, a Hrana
Stream
is created. ThisStream
owns an SQLite connection to execute queries sent to that stream. AStream
is identified with abaton.
When a stream is opened, thebaton
for that stream is returned to the client. The client can pass the baton to subsequent requests so that they are executed on the stream identified by that baton. When clients want to dispose of that stream, they emit aClose
command on that stream, which is closed, freeing the SQLite connection in the process. To avoid keeping streams open forever, a watchdog is associated with each stream if a client fails to complete it. If the watchdog fails to be kicked (by receiving a request on that stream), then, after a defined period (10s currently), the stream is expired, and the SQLite connection is released.Transactions also have a timeout mechanism that is separate from the stream. Transactions are given a (configurable, default to 5s) timeslot to complete. Failing to do so yields a timeout error. The lock-stealer ensures that the transaction lock is delivered to another waiting connection, if any.
Motivation
This change is motivated by multiple factors:
hrana stream expired
in prod, which leads to user errors. That is not desirable, and we must address that somehow.Solution
1) Default to non-interactive transaction: On the hrana side, only return keep a stream open if the connection is not in an auto-commit state (i.e., the connection is currently in a transaction state). In other words, if the connection is not in a transaction state after executing a query, the default behavior is to close it. In that case, no baton is returned, and the resources are released. On subsequent queries, a new connection will be opened. This solves the following issues:
It may be desired cases. I'll use interactive connections outside of a transaction, but it would be an opt-in rather than an opt-out as it is today.
2) Hard and soft transaction limits The current transaction timeout is a hard timeout. It is enforced in two ways:
We should increase the timeout to let transactions run longer if there is no contention. We will still need to limit transactions so as not to prevent checkpoints, but we can also let them run for as long as a stream is open if the connection is uncontended.
3) **Kick the watchdog*** In the CLI scenario, keeping a stream open for an extended period may be desirable. We can allow that by having the client periodically kick the stream watchdog. This may require exposing a
Noop
operation in the hrana protocol, whose sole purpose is to keep the stream open.