What
This PR introduces a Flight-compatible protocol for syncing record batches from a remote source.
How
It uses the Arrow Flight do_put call to upload record batches representing data changes in a remote system. Each call also carries a command containing metadata about the action.
The batches are then stored in a cache, which flushes them from memory to object storage according to the given lag- and size-based criteria.
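As a rough illustration of the flush behaviour described above, here is a minimal sketch in Python. It is not the code in this PR: the `SyncCache` class, its `max_bytes` and `max_lag_s` thresholds, and the in-memory `storage` list standing in for object storage are all hypothetical, and it assumes that crossing either threshold is enough to trigger a flush.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class SyncCache:
    """Buffers (command, batch) pairs and flushes them on a size or lag threshold."""

    max_bytes: int    # hypothetical size threshold
    max_lag_s: float  # hypothetical lag threshold, in seconds
    clock: Callable[[], float] = time.monotonic
    pending: List[Tuple[dict, bytes]] = field(default_factory=list)
    pending_bytes: int = 0
    oldest_at: Optional[float] = None
    # Stand-in for object storage: each entry is one flushed group of batches.
    storage: List[List[Tuple[dict, bytes]]] = field(default_factory=list)

    def put(self, command: dict, batch: bytes) -> bool:
        """Buffer one batch; return True if this call triggered a flush."""
        now = self.clock()
        if self.oldest_at is None:
            self.oldest_at = now  # arrival time of the oldest unflushed batch
        self.pending.append((command, batch))
        self.pending_bytes += len(batch)

        too_big = self.pending_bytes >= self.max_bytes
        too_old = now - self.oldest_at >= self.max_lag_s
        if too_big or too_old:
            self.flush()
            return True
        return False

    def flush(self) -> None:
        """Move everything buffered so far into the storage stand-in."""
        if self.pending:
            self.storage.append(self.pending)
        self.pending = []
        self.pending_bytes = 0
        self.oldest_at = None
```

Note that the thresholds are only checked inside `put`, so in this sketch a buffered batch that exceeds the lag threshold stays in memory until the next upload arrives.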
TODOs (here or in follow-on PRs)
DELETE actions are not wired up yet, so only INSERT actions (or the append part of an UPDATE) are replicated correctly
even the append support is not idempotent (replaying a sync will duplicate entries for the same PKs)
no concurrency/background writes
lag-based replication is triggered in a push-like manner, meaning that if there's no activity on the interface at all, batches will remain dormant in memory indefinitely
add a bunch of unit and integration tests, particularly around read/write partition pruning (though that is mostly critical for point 1 above)
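To make the idempotency item in the list above concrete, here is a small Python sketch of what replaying the same batch does under a plain append, next to one possible idempotent alternative. Both functions, the list-of-dicts table representation, and the `id` primary key column are hypothetical and only illustrate the problem; they do not describe how this PR or a follow-on would implement it.

```python
def append(table, batch):
    """Plain append: a replayed batch is stored a second time."""
    return table + batch


def upsert_by_pk(table, batch, pk="id"):
    """Hypothetical idempotent variant: the last write per primary key wins."""
    merged = {row[pk]: row for row in table}
    for row in batch:
        merged[row[pk]] = row
    return list(merged.values())


batch = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]

# Apply the same batch twice, as would happen if a sync were replayed.
appended = append(append([], batch), batch)
upserted = upsert_by_pk(upsert_by_pk([], batch), batch)

print(len(appended), len(upserted))  # prints "4 2"
```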