superfly / litefs

FUSE-based file system for replicating SQLite databases across a cluster of machines
Apache License 2.0

Object Store Mediated Replication #434

Open pg-wtatum opened 3 months ago

pg-wtatum commented 3 months ago

Related to #18, it would be extremely powerful to be able to replicate data to read replicas outside the datacenter infrastructure, which cannot securely or performantly access the HTTP endpoints of the LiteFS primary directly. My proposed use case is offline-first web and mobile clients that can tolerate somewhat higher replication latency than the current primary-to-read-replica method, but still want an efficient and repeatable way to replicate database state from a primary.

If #18 is implemented, a very powerful added capability would be for detached clients to use S3 to get "in sync" with the primary (minus a time delay) without needing to manage any connectivity with the primary. The major additional effort to support this would be the ability to trigger a "restore" operation from the backup independent of the rest of the LiteFS plumbing (most likely from an application SDK rather than via filesystem operations). Ideally this would only require the client to have S3 coordinates for the various objects holding the backup, and no other LiteFS config.
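To make the client-side flow concrete, here is a minimal sketch of the "what do I still need to download" step, written in Python purely for illustration. It is not LiteFS code. It assumes the backup is stored as objects named `<minTXID>-<maxTXID>.ltx` with each TXID written as 16 hex digits, and that the client remembers the last transaction ID it applied. The names `parse_key` and `plan_restore` are hypothetical, and listing or fetching the objects from S3 is left out.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# ASSUMPTION: objects are named "<minTXID>-<maxTXID>.ltx", 16 hex digits each,
# optionally under a key prefix such as "db/". Adjust to the real layout.
_KEY_RE = re.compile(r"(?:^|/)([0-9a-f]{16})-([0-9a-f]{16})\.ltx$")


class Segment(NamedTuple):
    min_txid: int
    max_txid: int
    key: str


def parse_key(key: str) -> Optional[Segment]:
    """Parse an object key into a Segment, or return None if it does not match."""
    m = _KEY_RE.search(key)
    if m is None:
        return None
    lo, hi = int(m.group(1), 16), int(m.group(2), 16)
    if lo > hi:
        return None
    return Segment(lo, hi, key)


def plan_restore(keys: List[str], current_txid: int) -> Tuple[List[str], int]:
    """Pick the keys a detached client should apply, in order.

    Returns (keys_to_apply, resulting_txid). Stops at the first gap so the
    client never applies a segment whose predecessor is missing.
    """
    segments = sorted(s for s in map(parse_key, keys) if s is not None)
    plan: List[str] = []
    pos = current_txid
    for seg in segments:
        if seg.max_txid <= pos:
            continue  # already applied locally
        if seg.min_txid > pos + 1:
            break  # gap in the chain; wait for the missing segment
        if seg.min_txid < pos + 1:
            continue  # overlaps local state; cannot be applied incrementally
        plan.append(seg.key)
        pos = seg.max_txid
    return plan, pos
```

A client that last applied transaction 2 would call `plan_restore(listed_keys, 2)`, download the returned keys in order, apply them, and persist the returned transaction ID as its new position. Only read access to the bucket is needed for this step, which matches the "S3 coordinates only" goal above.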

pg-wtatum commented 2 months ago

I raised this before spotting https://github.com/superfly/litefs/pull/315 -- which makes it sound like the client for streaming writes via LTX to an external S3 (or the filesystem) is already in place. Would it be correct to assume that, since #18 isn't closed, there is not yet a straightforward method to run a restore? For my use case the absolute ideal would include two other capabilities:

I recognize that this is likely a lot to ask, but since it sounds like it aligns with existing project goals, I'm just clarifying: is this actually something that's on the roadmap?