pkulchenko opened 1 year ago
Great idea; a mesh network database seems like a blockchain.
I'm dreaming of a mesh network based on redbean. Currently I'm using a customized Caddy web server with HTTP/3 behind NAT, and I added 10 lines of Go code to the Caddy server to send UDP packets between peers to penetrate the NAT and build a mesh network. Unfortunately, redbean still doesn't support HTTP/3: https://github.com/jart/cosmopolitan/issues/595
Background
As I've been working on several state synchronization projects -- SyncX, an implementation of Sync9 in Lua, and Cabinet, an implementation of shelf in Lua -- I came to realize that it may be worth extending the web framework I've been developing with SQLite-based data synchronization, as it already includes SQLite, has the network layer to send and accept HTTP/HTTPS requests, and has a DB management layer that takes care of query management and schema upgrades.
Terminology
A quick word on terminology first. I'm using nodeid, peerid, and occasionally userid, which all mean slightly different things. nodeid is a unique identifier for each node/instance running the module and participating in synchronization. Each node can have multiple databases it synchronizes, but for the purpose of this document we can assume it's one DB per node (without changing anything in the design). A peer is one of the nodes that a particular node directly interacts with, with its identity and some associated version information being tracked.
SQLite Session extension
The original goal was to build the simplest synchronization mechanism that would take DB changes and send them to other peers to apply. Since SQLite already provides the session extension that allows tracking of changes to one database and packaging and applying those changes to another database, I decided to build the synchronization on top of that mechanism.
The session extension API that is needed for the synchronization to work is quite simple:

- capture changes (the create_session() and changeset() functions)
- apply changes (the apply_changeset() function)

When changes are captured, the session extension generates the smallest number of INSERT/UPDATE/DELETE statements that create the same result when applied. The session extension also detects conflicts of various categories when changes are applied (which need to be resolved in different ways) and provides functions to inspect changesets if needed.
Design
The session extension needs to be configured to track changes, so the framework provides one function (sync()) to set up the necessary tables and hooks on the current connection. When the sync() function is called, it adds methods that capture/apply changes and "promotes" the connection to provide the sync by installing the hook that captures the changes. It also creates two tables that are used to track the changes: one to track peers to exchange changes with and the other to track generated and received changesets.

All changes executed through the "original" connection are going to be versioned. If some operations need to be excluded, they can be executed with the connection returned by the sync() function. If some tables need to be excluded, they can be excluded using the filter option to the sync() function. As a result of this setup, any combination of versioned and non-versioned tables is supported.
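As a sketch of what those two tracking tables might look like (the table and column names here are assumptions for illustration, not the framework's actual schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
  -- peers this node exchanges changes with
  CREATE TABLE IF NOT EXISTS sync_peers (
    peerid  TEXT PRIMARY KEY,   -- peer identity
    version TEXT                -- latest version acknowledged by this peer
  );
  -- generated and received changesets, keyed by version
  CREATE TABLE IF NOT EXISTS sync_changesets (
    version   TEXT PRIMARY KEY, -- hybrid-logical-clock version
    origin    TEXT,             -- nodeid that generated the changeset
    changeset BLOB              -- binary changeset from the session extension
  );
""")
tables = sorted(r[0] for r in con.execute(
    "SELECT name FROM sqlite_master WHERE type='table'"))
print(tables)  # -> ['sync_changesets', 'sync_peers']
```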
The schema changes are not versioned, but schema upgrades are provided by the upgrade() function and are done transparently for the user (as long as the schema follows some rules, as described below).
Only those nodes that have been configured with one or more "remotes" are going to send messages to those remotes. Only those nodes that have been configured with a "userid" (node identifier) are going to capture local changes. A node can capture local changes but not send them anywhere, as it may accept requests from other nodes and then send the recorded changes in response to those requests.
This allows for a flexible and dynamic configuration with nodes accepting requests as desired and sending generated and received messages to other nodes (based on configured "remotes" values).
Changeset versioning
Versioning is based on hybrid logical clocks with the clock updated with each local operation and reset if the remote clock value is larger.
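A minimal hybrid logical clock can be sketched as follows (the (wall, counter) representation is an assumption; the framework's actual clock encoding is not shown in this document):

```python
import time

class HLC:
    """Hybrid logical clock: (wall, counter) pairs ordered lexicographically."""
    def __init__(self):
        self.wall, self.counter = 0, 0

    def now(self):
        """Advance the clock for a local event and return its timestamp."""
        wall = int(time.time())
        if wall > self.wall:
            self.wall, self.counter = wall, 0  # physical time moved forward
        else:
            self.counter += 1                  # same second: bump the counter
        return (self.wall, self.counter)

    def update(self, remote):
        """Merge a timestamp received from a peer, resetting the local
        clock forward if the remote value is larger."""
        wall = int(time.time())
        if wall > self.wall and wall > remote[0]:
            self.wall, self.counter = wall, 0
        elif remote > (self.wall, self.counter):
            self.wall, self.counter = remote[0], remote[1] + 1
        else:
            self.counter += 1
        return (self.wall, self.counter)

clock = HLC()
a = clock.now()
b = clock.now()
# timestamps are strictly increasing even within the same second
print(b > a)  # -> True
```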
Only local changes are versioned. As existing rows or values are not versioned individually, deletes don't need to be tracked in any special ways (no tombstones).
Changesets can be purged when their acceptance is acknowledged by all peers. It's easy to check for the earliest version seen by all peers and delete all older versions (if needed).
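The purge check can be a single SQL statement over the two tracking tables (table and column names are hypothetical, matching the sketch above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
  CREATE TABLE sync_peers (peerid TEXT PRIMARY KEY, version INTEGER);
  CREATE TABLE sync_changesets (version INTEGER PRIMARY KEY, changeset BLOB);
""")
con.executemany("INSERT INTO sync_peers VALUES (?, ?)",
                [("peer-a", 4), ("peer-b", 2)])
con.executemany("INSERT INTO sync_changesets(version) VALUES (?)",
                [(1,), (2,), (3,), (4,)])
# every peer has acknowledged at least MIN(version), so anything at or
# below that watermark can be safely deleted
con.execute("""
  DELETE FROM sync_changesets
   WHERE version <= (SELECT MIN(version) FROM sync_peers)
""")
remaining = [v for (v,) in con.execute(
    "SELECT version FROM sync_changesets ORDER BY version")]
print(remaining)  # -> [3, 4]
```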
Most of the time the received changesets are going to have a newer/later version, so they can be directly applied to the current state of the DB. In some cases, some local changesets may already record newer changes, in which case to apply received changesets, the local changesets need to be "reverted" and re-applied after remote changesets are applied.
In other words, if versions 1, 3, and 4 are applied locally and version 2 is received, then 4 and then 3 need to be reverted with the state restored as of version 1 with 2 then applied and 3 and 4 re-applied. All these changes happen within a transaction and the conflicts are resolved as described below.
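The revert/re-apply ordering from that example can be sketched as follows (versions shown as plain integers for simplicity; real versions are HLC-based):

```python
def merge_order(applied, incoming):
    """Decide how to merge an out-of-order changeset: revert local
    changesets newer than the incoming one (newest first), then apply
    the incoming changeset and re-apply the reverted ones in order."""
    newer = sorted((v for v in applied if v > incoming), reverse=True)
    return newer, [incoming] + sorted(newer)

# versions 1, 3, and 4 are applied locally; version 2 is received
revert, reapply = merge_order([1, 3, 4], 2)
print(revert, reapply)  # -> [4, 3] [2, 3, 4]
```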
Conflict handling
With the session extension sending old/new values for all UPDATE statements, it's possible to detect various conflicts that may happen with concurrent updates and need to be resolved for the changes to be consistently applied despite arriving in (potentially) different order.
A conflict resolver function is provided as part of the apply_changeset() call. Some of the received changesets may have conflicts that can't be resolved locally. For example, if the remote schema has been updated to add fields to a table, then an insert into that table can't be applied locally until the schema is updated. These changesets are marked as PENDING and are re-applied as part of each sync() call, with the expectation that the local conditions have improved enough to correctly apply the changesets.

Sending messages in the opposite direction (from peers with an older schema to a newer one) is allowed, as changes for tables with fewer fields are accepted (assuming schema changes add fields but don't remove them).
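As an illustration of what such a resolver does, a conflict handler for the session extension's sqlite3changeset_apply() maps each conflict category to a resolution action; the specific policy below is an assumption for demonstration, not the framework's actual resolver:

```python
# Conflict categories and actions mirror the constants the SQLite session
# extension passes to the sqlite3changeset_apply() conflict handler
# (SQLITE_CHANGESET_DATA=1 ... FOREIGN_KEY=5; OMIT=0, REPLACE=1, ABORT=2).
DATA, NOTFOUND, CONFLICT, CONSTRAINT, FOREIGN_KEY = range(1, 6)
OMIT, REPLACE, ABORT = range(3)

def resolver(conflict_type):
    if conflict_type == DATA:      # row exists, but stored values differ
        return REPLACE             # let the incoming change win
    if conflict_type == NOTFOUND:  # row to UPDATE/DELETE no longer exists
        return OMIT                # skip; there is nothing to modify
    if conflict_type == CONFLICT:  # INSERT hits an existing primary key
        return REPLACE
    return ABORT                   # constraint/FK issues: roll back, so the
                                   # changeset can be marked PENDING and retried

print(resolver(DATA), resolver(NOTFOUND))  # -> 1 0
```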
Package format
I've considered using JSON, multipart/mixed, a SQLite DB, and a couple of other options, but for now settled on zipping the changesets into one archive, with their versions as the filenames. Since SQLite can both generate and read/decompress zip archives, this plays well with the goals of the project.
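Packing changesets this way is straightforward; for instance (the version-as-filename format shown is an assumption):

```python
import io
import zipfile

# binary changesets keyed by their version (used as filenames in the archive)
changesets = {"2-nodeA": b"<changeset 2>", "3-nodeB": b"<changeset 3>"}

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
    for version, blob in changesets.items():
        z.writestr(version, blob)

# the receiving side can list and extract changesets by version
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as z:
    names = sorted(z.namelist())
    payload = z.read("2-nodeA")
print(names)  # -> ['2-nodeA', '3-nodeB']
```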
Requirements/Limitations
Sync-related constraints
Upgrade-related constraints
Motivation/Goals