automerge / automerge-classic

A JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.
http://automerge.org/
MIT License

Sync protocol implementation (second version) #332

Closed. ept closed this 3 years ago.

ept commented 3 years ago

Here is a second attempt at the sync protocol discussed in #290, superseding the first attempt in #324, after discussion with @pvh @orionz @alexjg. It addresses several issues with the first protocol, in particular, making the API more functional (avoiding side-effects) and giving the networking layer more control over encoding and scheduling of communication.

There is now only one type of message, a SyncMessage with the following structure:

interface SyncMessage {
  heads: Hash[]
  need: Hash[]
  have: SyncHave[]
}

interface SyncHave {
  lastSync: Hash[]
  bloom: Uint8Array
}

type Hash = string // 64-digit hex string

When a node sends one of these messages, it has the following meaning: heads contains the hashes of the current heads of the sender's document; need lists the hashes of any changes that the sender knows it is missing and would like to receive; and have summarises the changes that the sender already has (explained in more detail below).

When two nodes connect to each other, they initially both send each other a SyncMessage to inform the other about their current state. If the two nodes are already in sync, there are no further messages. If the nodes need to send each other changes, they do so in response to this initial message, and the changes are also accompanied by another SyncMessage. Most of the changes are sent in this first response, but in some edge cases there can be a bit more back and forth of changes and SyncMessages until we have silence.

Whenever a node updates its document (because it updated the document itself, or because it received changes from another node), it informs all the nodes it is connected to by sending them each a SyncMessage, in exactly the same way as the initial message on connection. This triggers a fresh run of the protocol. Local changes can immediately be sent to all connected nodes, while changes received from another node should be exchanged through this protocol to avoid having a node unnecessarily receive the same changes several times from different nodes. If a document update happens in the middle of a sync protocol run, that's fine too.
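
In code, that notification step might look like the following sketch, using the backend functions introduced below (connectedPeers and peer.lastSync are assumptions about the networking layer's own bookkeeping, not part of the Automerge API):

// Whenever the local document changes (either by local edits or by applying
// changes received from a peer), restart the protocol with every connected peer:
for (const peer of connectedPeers) {
  peer.send(Backend.encodeSyncMessage(Backend.syncStart(state, peer.lastSync)))
}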

Automerge implements the protocol with the following backend functions, four of which are new:

// New functions
function decodeSyncMessage(bytes: Uint8Array): SyncMessage
function encodeSyncMessage(message: SyncMessage): Uint8Array
function syncStart(state: BackendState, lastSync?: Hash[]): SyncMessage
function syncResponse(state: BackendState, message: SyncMessage): [SyncMessage, Uint8Array[]]

// Existing functions (getMissingDeps has been changed to take some extra arguments)
function getHeads(state: BackendState): Hash[]
function getMissingDeps(state: BackendState, changes?: Uint8Array[], heads?: Hash[]): Hash[]

Note that none of these functions modify the document state: they are all side-effect-free. The only effect on a document happens when the recipient of some changes takes them and calls Automerge.applyChanges() to apply them to a document.
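
To illustrate the intended call pattern, here is a rough sketch of a single protocol round (Backend, state, lastSync, socket, and receivedBytes are stand-ins for whatever the application provides; as discussed below, a real implementation must also suppress duplicate messages):

// On connecting to a peer, announce our current state
// (lastSync is remembered from a previous session, or undefined on first contact):
socket.send(Backend.encodeSyncMessage(Backend.syncStart(state, lastSync)))

// On receiving a SyncMessage from the peer, compute a response.
// None of these calls mutate the backend state:
const message = Backend.decodeSyncMessage(receivedBytes)
const [reply, changes] = Backend.syncResponse(state, message)
socket.send(Backend.encodeSyncMessage(reply))
for (const change of changes) socket.send(change)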

Using the protocol does require the networking layer to keep some additional state: in particular, it must accumulate any changes that are received (using getMissingDeps() to identify any missing changes), and update the document when the set of changes is complete. It also needs to remember the heads from the last completed sync, and it needs to take some steps to avoid duplicate messages (since the functions above are stateless, they don't know whether an identical change or SyncMessage has already been sent previously). The intention is that, with the API provided, this becomes a fairly simple state machine.

A simple example of such a state machine appears in the class SyncPeer at the top of test/sync_test.js, and this state machine is used extensively in the tests. I have not made the state machine part of Automerge's public API, because I suspect that different networking layers will want to implement the protocol state machine slightly differently (e.g. a client-server WebSocket implementation may well have different needs from a peer-to-peer implementation). My hope is that the API in this PR is general enough to satisfy all those use cases, but also powerful enough that networking layers don't need to do too much extra work.
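
For a sense of its shape, here is a heavily condensed sketch of such a state machine (hypothetical names; Backend stands for the module exposing the functions above, and the real SyncPeer in test/sync_test.js handles more details, such as suppressing duplicate messages):

class MiniSyncPeer {
  constructor(state, send, apply) {
    this.state = state     // Automerge backend state
    this.send = send       // transmits a Uint8Array to the peer
    this.apply = apply     // applies a batch of changes (e.g. via Automerge.applyChanges())
                           // and returns the updated backend state
    this.lastSync = []     // heads at the last completed sync with this peer
    this.pending = []      // received changes whose dependencies are not yet satisfied
  }

  connect() {  // called when the connection is established
    this.send(Backend.encodeSyncMessage(Backend.syncStart(this.state, this.lastSync)))
  }

  onSyncMessage(bytes) {  // called when the peer sends us a SyncMessage
    const [reply, changes] = Backend.syncResponse(this.state, Backend.decodeSyncMessage(bytes))
    this.send(Backend.encodeSyncMessage(reply))
    for (const change of changes) this.send(change)
  }

  onChange(changeBytes) {  // called when the peer sends us a change
    this.pending.push(changeBytes)
    if (Backend.getMissingDeps(this.state, this.pending).length === 0) {
      // all dependencies are satisfied, so the changes can now be applied
      this.state = this.apply(this.pending)
      this.pending = []
      this.lastSync = Backend.getHeads(this.state)
    }
  }
}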

Some discussion points:

Internally, each entry of the have field of a SyncMessage contains two things: lastSync are the heads at the time of the last successful sync with the recipient of the message, and bloom is a Bloom filter summarising all of the changes that the sender of the message has added since then. The details are explained in a paper. The API exposes Bloom filters only as byte arrays without any further structure, since the manipulation and interpretation of Bloom filters is completely encapsulated in the API.

syncResponse() returns all changes that it knows about and that satisfy at least one of the following conditions:

  1. the change's hash appears in the need field of the received message;
  2. the Bloom filter in the received have field indicates that the recipient does not have the change;
  3. the change depends on another change that satisfies one of these conditions.

I believe that changes received from another node should not be applied to the document until all dependencies are satisfied. There are two reasons for this:

  1. A missing dependency (a "hole") may never be filled in. Say you partially sync a change history with another node, but that node disconnects before all of the dependencies are filled in; moreover, that node never comes back again and never syncs with any other node either. We will then forever have this incomplete change history, as no node will ever be able to fill in the hole; we will forever keep asking for the missing change in the need field, but the request will never be satisfied. Having such eternal holes also makes it difficult to determine when the sync with another node has completed. Perhaps we could give up on holes after a timeout, but that would make the protocol more complicated.
  2. In the case where one node sends us an incomplete change history and then disconnects, it may seem that, for efficiency, we should keep the partial change history we received from that node and try to fill in the holes by syncing with another node. Even when this is possible, because another node has the missing changes, it doesn't work well with the Bloom filter sync protocol. The reason is that if one change is not present in the Bloom filter, we assume that all changes that depend on it are also not present. This assumption is important for getting good performance from the Bloom filters, since it drastically reduces the impact of false positives. With this rule about dependencies (see the sketch below), any changes that depend on the hole will get re-sent anyway, even if they are present in the Bloom filter, so keeping the partial history does not actually avoid re-sending changes we already have.
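
To make the dependency rule concrete, here is a sketch of the selection logic that syncResponse() encapsulates (hypothetical helper names: bloomMayContain queries the received Bloom filter, and need is the received need field):

// changes must be iterated in causal order, dependencies before dependents
function selectChangesToSend(changes, bloomMayContain, need) {
  const sending = new Set()
  for (const change of changes) {
    const dependsOnSent = change.deps.some(dep => sending.has(dep))
    // send a change if it was explicitly requested, if the recipient appears
    // not to have it, or if it depends on another change we are already sending
    if (need.includes(change.hash) || !bloomMayContain(change.hash) || dependsOnSent) {
      sending.add(change.hash)
    }
  }
  return changes.filter(change => sending.has(change.hash))
}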

Nevertheless, when performing a large sync, it would be annoying if we can't resume a partially completed sync that got interrupted due to an unreliable network connection, or if we can't display any data in an app until it has fully synced. To mitigate this, I think we could break a large sync into several smaller syncs, performed sequentially; then not much is lost if we disconnect during one of these smaller syncs. For example, each document or each head could be a separate sync. If there is a really long change history for a document, we could even break it up into chunks (say, sync 1MB of changes at a time, rather than jumping all the way to the heads in one sync).

pvh commented 3 years ago

A thought on "unfillable holes": perhaps these could be garbage collected during a save() compaction? This would let them hang around for a little while in the hope of being restored, but at whatever interval a developer chooses to run a compaction and save() the file, they'd be polished away.

ept commented 3 years ago

@pvh With "garbage collecting holes", do you mean discarding all changes that depend transitively on the hole? That is possible, and in fact I believe the current save() already does this, since it only stores changes that are reachable from the heads, and changes with missing dependencies don't count towards the heads (they are only enqueued but not applied to the document). However, I'm still doubtful that it's a good idea to rely on this behaviour.

orionz commented 3 years ago

What about the idea of resolving holes by having applyChanges reject changes with missing deps (and specify which holes)?

orionz commented 3 years ago

Doing so means we can remove the getMissingChanges API call and push the responsibility for hole management onto the developer.

ept commented 3 years ago

What about the idea of resolving holes by having applyChanges reject changes with missing deps (and specify which holes)?

I think that would be a good idea; we've previously had cases of people getting confused when they called applyChanges and the document did not change because a dependency was missing. Making this an explicit error would be a better developer experience than silently buffering the change.

I'm not sure what the type signature for applyChanges should look like, though. We want it to either return a success status with an updated document, or an error status with a list of missing hashes. JavaScript exceptions are pretty bad so I don't really want to use an exception for the error status. Maybe we could make it return a kind of pseudo-tagged union type like this (in TypeScript types):

function applyChanges<T>(doc: Doc<T>, changes: Uint8Array[]): Success<T> | MissingDeps

interface Success<T> {
  status: 'success'
  doc: Doc<T>
}

interface MissingDeps {
  status: 'missingDeps'
  missingDeps: Hash[]
}
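
A caller would then narrow on the status tag, along these lines (a hypothetical usage sketch; requestFromPeers stands in for application code, and doc is assumed to be declared with let):

const result = Automerge.applyChanges(doc, changes)
if (result.status === 'success') {
  doc = result.doc
} else {
  // result.status is 'missingDeps': ask our peers for the changes with these hashes
  requestFromPeers(result.missingDeps)
}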

What do you think?

pvh commented 3 years ago

@ept a couple of minor questions/ideas

My focus right now has been on taking the example code in sync_test.js and trying to implement it in my little testbed application, and I think I'm getting close. You can see that implementation here: https://github.com/pvh/automerge-demo. Fair warning: it is very WIP. While I'm trying to only commit working versions, it sort of grows bloat and then reconciles it again, and I might have hand-edited my Automerge tree without pushing those changes publicly once or twice. If it doesn't work, that's absolutely my fault, and I'll aim to fix it before calling it a day here.

dan-weaver commented 3 years ago

I think that would be a good idea; we've previously had cases of people getting confused when they called applyChanges and the document did not change because a dependency was missing. Making this an explicit error would be a better developer experience than silently buffering the change.

This may be a quirk of our application, but this would currently create race conditions for us. We rely on being able to listen for changes immediately when a signal comes in, and can then rest assured that it's OK for our naive sync protocol to finish delivering the entire doc in its own time.

Possibly that won't be a legitimate problem once the sync protocol has landed, though?

ept commented 3 years ago

It feels weird that the messages don't have the type baked into the binary for the wire protocol. Should we include a leading byte / bit that determines the message type?

@pvh They do have a leading byte (0x42, chosen arbitrarily) to indicate the message type. Happy to pick a different number if you don't like that one.

Should we add a more generic {encode/decode}Message pair of functions? (It could use that first byte to decide what message type it has.)

Could do, although since we only have one message type at the moment, that seems a bit redundant.
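
If we did want one, it would presumably be little more than a dispatch on the leading type byte, along the lines of this sketch (assuming 0x42 remains the only defined type):

function decodeMessage(bytes: Uint8Array): SyncMessage {
  if (bytes[0] === 0x42) return decodeSyncMessage(bytes)
  throw new RangeError(`unknown message type: ${bytes[0]}`)
}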

We rely on being able to listen for changes immediately when a signal comes in, and can then rest assured that it's OK for our naive sync protocol to finish delivering the entire doc in its own time.

@dan-weaver You could solve this by buffering changes until all the dependencies are satisfied. Something like this:

let receivedChanges = [], doc = Automerge.init()

function onReceiveChanges(changes) {
  receivedChanges = receivedChanges.concat(changes)
  const result = Automerge.applyChanges(doc, receivedChanges)
  if (result.status === 'success') {
    doc = result.doc
    receivedChanges = []
  } else if (result.status === 'missingDeps') {
    // do nothing, try again on next call to onReceiveChanges()
  } else {
    throw new Error(`Unexpected result status: ${result.status}`)
  }
}

…which would give you the current behaviour.

pvh commented 3 years ago

treatment of change messages vs. sync messages feels inconsistent

  - asking developers to manually check getMissingDeps feels incorrect
  - waiting for a sync message to arrive before applying changes feels wrong too (particularly for single change messages) -> why shouldn't we apply all applicable changes and only publish the patch at a "suitable time"?
  - waiting to update sync state until full sync is complete feels incorrect -> if I am getting 10,000 changes from a peer, how do I know how many changes I'm waiting on for a progress bar?

Other areas: naming & types, network protocol, febe protocol, sync state, changes.

There are probably a few other smaller pieces but this covers a lot of it.

arj03 commented 3 years ago

This looks very interesting.

I was wondering if you have thought about extracting this functionality into a separate library so it can be reused in other projects? It seems that the API should be general enough.

Local changes can immediately be sent to all connected nodes, while changes received from another node should be exchanged through this protocol to avoid having a node unnecessarily receive the same changes several times from different nodes.

I was wondering if you have considered using a spanning-tree-like protocol such as plumtree, epidemic broadcast trees or Brisa for replicating changes among multiple nodes? See this for more information.

ept commented 3 years ago

Hi @arj03,

I was wondering if you have thought about extracting this functionality into a separate library so it can be reused in other projects? It seems that the API should be general enough.

Potentially, although it does make some assumptions that may be specific to Automerge, such as the API provided for traversing the hash graph. My inclination would be to keep this functionality as part of the Automerge core for now, and we can consider splitting it out later if there is demand.

I was wondering if you have considered using a spanning-tree-like protocol such as plumtree, epidemic broadcast trees or Brisa for replicating changes among multiple nodes?

I've looked at a few gossip / p2p overlay networks; our intention is that the API provided by Automerge core should be general enough to be used for many different types of network topology. There can then be several different networking layers (some using client-server WebSockets, some using p2p protocols, etc.) that build upon the same core API. Apps can then choose a networking layer that is appropriate for their use case.

arj03 commented 3 years ago

Yeah, I guess it makes sense to keep it in automerge, especially as you fine-tune the algorithm so that it fits the problem at hand as well as possible, instead of trying to solve many different problems all at once.

I mentioned the articles because they deal specifically with the timeout problem and with how to handle data from multiple peers. It is good to know you are aware of the existing literature in that space. What I'm trying to say is that it seems like a separate concern from the main algorithm; then again, in automerge it might make sense to keep those two aspects very close, because an end user doesn't really care about those details.

Anyway just wanted to provide some input, what you are doing here is really exciting!

lightsprint09 commented 3 years ago

Can we close this?

ept commented 3 years ago

@lightsprint09 Yes. Closing this because it is superseded by #339.