canonical / dqlite

Embeddable, replicated and fault-tolerant SQL engine.
https://dqlite.io
Other
3.89k stars 218 forks source link

Opt-in querying of non-leaders #709

Open cole-miller opened 1 month ago

cole-miller commented 1 month ago

This PR is a first stab at allowing non-leader nodes to serve PREPARE, QUERY, and QUERY_SQL requests. When served by a non-leader, such requests will in general return stale data (QUERY, QUERY_SQL), or fail unhelpfully (PREPARE and QUERY_SQL, see #395). But when these limitations are tolerable, the option to query a non-leader can be helpful to spread load or (when targeting the local node) avoid network round-trips.

Clients must opt into relaxing the current leader checks on a per-connection basis. This is done by repurposing the old, unused flags field on the OPEN request (which non-leaders already accept). We relax the leader checks when the least-significant bit is set in flags; this is backwards-compatible since extant versions of go-dqlite always set this field to 0. The specific relaxations are:

Signed-off-by: Cole Miller cole.miller@canonical.com

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 92.70073% with 10 lines in your changes missing coverage. Please review.

Project coverage is 81.01%. Comparing base (06992a6) to head (384712f). Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
src/gateway.c 84.12% 2 Missing and 8 partials :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #709 +/- ## ========================================== + Coverage 80.95% 81.01% +0.05% ========================================== Files 196 196 Lines 29186 29287 +101 Branches 4089 4090 +1 ========================================== + Hits 23627 23726 +99 - Misses 3895 3899 +4 + Partials 1664 1662 -2 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

cole-miller commented 1 month ago

Note to self/reviewer: might be worth discussing whether we should skip in leader__barrier if allow_stale is set but we happen to be the leader. In the current version we do skip in this situation, and I think that's the right choice, but I could be convinced otherwise.

cole-miller commented 1 month ago

I just realized there's an issue I didn't deal with---if you connect to a node that happens to be the leader and open a DB with with the ALLOW_STALE flag, the connection will still be shut down when the node loses leadership. I'll fix that quickly...

cole-miller commented 1 month ago

Pushed a new revision incorporating feedback from @marco6---the semantics of the OPEN flag have changed, in particular it now blocks modifications even when the server is currently the leader. This makes the behavior of connections using this flag more predictable and pushes clients to choose the right kind of connection for their use-case. The flag has been renamed from ALLOW_STALE to READONLY to reflect this.