automerge / automerge-repo-rs

On the semantics of save and sync interaction #47

Open gterzian opened 1 year ago

gterzian commented 1 year ago

I am looking into adding another example, this time implementing Paxos.

The algorithm requires saving certain updates to stable storage, and only sending them as messages to peers once they have been flushed to disk. These precautions are necessary to make the algorithm tolerant of crashes.

In our current setup, whenever the document changes, saving it via the Storage implementation and syncing it happen concurrently. In the case of a crash, either one, both, or neither may have succeeded. If a sync message is sent out before the changes are durably saved, a subsequent crash could break the safety of the algorithm.

Now, since users will have different needs with regard to the interaction between sync and save, I propose we define an API through a contract between the repo and storage, expressed as the following invariant:

For a given change, no sync messages will be sent out until the corresponding save future has resolved.

This would give implementers of Storage the flexibility to resolve the save future as early as they want, and in the above case to resolve it only once the changes have been fully flushed to disk. (One can imagine a setup with two documents, one backed by more volatile storage, the other backed by storage where each change is flushed to disk before the future resolves.)
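
To make the "flushed to disk" end of that spectrum concrete, here is a minimal sketch of the blocking work a durable Storage implementation might do before letting its save future resolve. It uses only the standard library. The function name `durable_append` is hypothetical and not part of automerge-repo-rs; how it would be wired into the actual `Storage` trait is left open.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper (not part of automerge-repo-rs): appends `bytes` to the
/// file at `path` and returns only after the OS reports the data as synced.
///
/// A durable Storage implementation could resolve its save future only after a
/// call like this returns `Ok(())`; a more volatile one could resolve earlier.
pub fn durable_append(path: &Path, bytes: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(bytes)?;
    // `sync_all` asks the OS to flush both file contents and file metadata.
    // Caveat: for a newly created file, full crash safety may also require
    // syncing the parent directory, which this sketch does not do.
    file.sync_all()?;
    Ok(())
}
```

In an async context this blocking call would presumably run on a dedicated thread or blocking pool, with the save future completing when it finishes.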

Internally, this would require matching each save future to a change, here.
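
As a rough illustration of that bookkeeping, the sketch below tracks which changes have had their save resolve and exposes the point up to which syncing is allowed. Everything here is hypothetical: `SaveGate`, `ChangeSeq`, and the use of a local sequence number to identify a change are assumptions for illustration, not existing repo internals. It also assumes, conservatively, that a change may only be synced once every earlier change has been saved too.

```rust
use std::collections::BTreeSet;

/// Hypothetical change identifier: a local, monotonically increasing number.
pub type ChangeSeq = u64;

/// Hypothetical bookkeeping that matches save completions to changes.
#[derive(Debug, Default)]
pub struct SaveGate {
    /// Sequence number that the next local change will receive.
    next_seq: ChangeSeq,
    /// Changes whose save has resolved but which still wait on an earlier one.
    saved_out_of_order: BTreeSet<ChangeSeq>,
    /// Every change with a sequence number below this has been saved.
    durable_up_to: ChangeSeq,
}

impl SaveGate {
    pub fn new() -> Self {
        Self::default()
    }

    /// Registers a new local change; the caller would start its save here.
    pub fn record_change(&mut self) -> ChangeSeq {
        let seq = self.next_seq;
        self.next_seq += 1;
        seq
    }

    /// To be called when the save future for `seq` resolves.
    pub fn save_resolved(&mut self, seq: ChangeSeq) {
        // Ignore duplicates and sequence numbers that were never handed out.
        if seq < self.durable_up_to || seq >= self.next_seq {
            return;
        }
        self.saved_out_of_order.insert(seq);
        // Advance the durable prefix over every contiguous saved change.
        while self.saved_out_of_order.remove(&self.durable_up_to) {
            self.durable_up_to += 1;
        }
    }

    /// Exclusive upper bound of the changes that may be synced.
    pub fn syncable_up_to(&self) -> ChangeSeq {
        self.durable_up_to
    }

    /// Whether a sync message containing change `seq` may be sent.
    pub fn may_sync(&self, seq: ChangeSeq) -> bool {
        seq < self.durable_up_to
    }
}
```

With two changes `a` and `b` recorded in that order, resolving the save for `b` first unblocks nothing; once the save for `a` resolves, both become syncable.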

Externally, this would require changing how the Sync protocol works, by adding a concept of "sync up to this change only", where "this change" would be "the last one for which save returned a future that has been resolved".
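
A deliberately simplified sketch of "sync up to this change only" follows. It is not the actual automerge sync protocol, which negotiates over document heads rather than a linear log. The `Change` struct, the linear log, and the `peer_has` and `durable_up_to` parameters are all assumptions made for illustration.

```rust
/// Hypothetical stand-in for an encoded change in a linear local log.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Change {
    pub seq: u64,
    pub bytes: Vec<u8>,
}

/// Returns the changes a peer is missing, capped at the durable boundary.
///
/// Assumptions: `log` is ordered by `seq`, `peer_has` is the number of leading
/// changes the peer already holds, and `durable_up_to` is the exclusive index
/// of the last change whose save has resolved.
pub fn changes_to_sync(log: &[Change], peer_has: usize, durable_up_to: usize) -> &[Change] {
    // Never look past the end of the log or past the durable boundary.
    let end = durable_up_to.min(log.len());
    // If the peer is already at or beyond the boundary, send nothing.
    let start = peer_has.min(end);
    &log[start..end]
}
```

Changes beyond `durable_up_to` are simply held back until their save resolves, at which point a later sync round would pick them up.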

gterzian commented 1 year ago

cc @alexjg