automerge / hypermerge

Build p2p collaborative applications without any server infrastructure in Node.js
MIT License

Partial update events #6

Closed: t-mullen closed this issue 6 years ago

t-mullen commented 6 years ago

The typical use case for a CRDT library like this would be to sync a "vanilla" plaintext document model over the network. This library looks great (having Dat handle the network is brilliant), but it's missing observers for fine-grained changes.

hm.on('document:updated', docId, doc) partially meets this, but there's no indication of what has changed in the document, so I would need to store the previous version of the document and diff the two on every change (which is infeasible).

What's needed is an update event that gives the range removed, old text, and new text. Batching these events would be a plus.
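To make the requested payload concrete, here is a minimal sketch of what a single "range replaced" event could carry. The function name `rangeChange` and the field names `start`, `end`, `removed`, and `inserted` are hypothetical and are not part of hypermerge or Automerge. The sketch compares two full strings, which is exactly the cost the request hopes to avoid, so it only illustrates the shape of the event, not how a library would produce it.

```javascript
// Hypothetical helper: describe the edit between two strings as one replaced range.
// It trims the common prefix and common suffix; whatever is left is the change.
function rangeChange(oldText, newText) {
  const maxPrefix = Math.min(oldText.length, newText.length)

  // Length of the shared prefix.
  let start = 0
  while (start < maxPrefix && oldText[start] === newText[start]) start++

  // Walk back from both ends while the suffixes match, without crossing `start`.
  let oldEnd = oldText.length
  let newEnd = newText.length
  while (oldEnd > start && newEnd > start && oldText[oldEnd - 1] === newText[newEnd - 1]) {
    oldEnd--
    newEnd--
  }

  return {
    start,                                    // index where the change begins
    end: oldEnd,                              // end (exclusive) of the removed range in oldText
    removed: oldText.slice(start, oldEnd),    // old text in that range
    inserted: newText.slice(start, newEnd)    // new text that replaces it
  }
}

// A batch would simply be an array of such objects delivered in one event.
const change = rangeChange('hello world', 'hello there world')
console.log(change)
```

An editor binding could apply such an object directly by replacing the range from `start` to `end` with `inserted`.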

jeffpeterson commented 6 years ago

Aha, interesting.

So, the way we can get a (useful) diff of what has changed is via Automerge.diff(oldDoc, newDoc). See the docs for the output format.

Keep in mind that this is O(n) with respect to the number of additional changes in the new document. It doesn't do a structural diff of the entire document, so it should be pretty performant.

I could add the previous version of the doc to the event if it makes this simpler:

hm.on('document:updated', (docId, doc, pDoc) => {
  const diff = Automerge.diff(pDoc, doc)
  // Do something with diff...
})

t-mullen commented 6 years ago

That would help a lot! So the structure of the doc object makes this performant? That might meet my requirements.

Thanks, I'm interested in trying this out in multihack. Hopefully it'll fix some sync issues we've been seeing.

jeffpeterson commented 6 years ago

Yep. Under the hood (doc._state), Automerge maintains a list of all changes that have been applied to the doc. Automerge.diff simply fetches any changes missing from the old doc and returns them in a developer-friendly format.
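The idea of "fetch any changes missing from the old doc" can be sketched with a toy change log. This is not Automerge's actual internal representation: the `actor`/`seq` tagging, the `op` payloads, and the helper names `clockOf` and `missingChanges` are assumptions made for illustration only.

```javascript
// Toy model (not Automerge's real internals): every change is tagged with the
// actor that made it and a per-actor sequence number.

// Summarize a log as { actor: highest sequence number seen }.
function clockOf(changes) {
  const clock = {}
  for (const { actor, seq } of changes) {
    clock[actor] = Math.max(clock[actor] || 0, seq)
  }
  return clock
}

// Return the changes in newChanges that the old log has not seen yet.
function missingChanges(oldChanges, newChanges) {
  const seen = clockOf(oldChanges)
  return newChanges.filter(({ actor, seq }) => seq > (seen[actor] || 0))
}

const oldLog = [
  { actor: 'alice', seq: 1, op: { action: 'set', key: 'title', value: 'Draft' } }
]
const newLog = [
  ...oldLog,
  { actor: 'alice', seq: 2, op: { action: 'set', key: 'title', value: 'Final' } },
  { actor: 'bob', seq: 1, op: { action: 'set', key: 'done', value: true } }
]

const missing = missingChanges(oldLog, newLog)
console.log(missing.map(c => c.op))
```

This toy version scans both logs on every call. A real implementation would presumably keep the clock alongside the document so that only the additional changes need to be visited.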

Keep in mind, if you want diffs within strings, they'll need to be stored as Automerge.Text.

I'll go ahead and publish a new version with that change; I think it'd be generally useful for other folks as well.

jeffpeterson commented 6 years ago

Multihack looks cool! I should point out, though (as the README evidently does not), that hypermerge only works in Node :/ Dat's discovery-swarm requires opening TCP sockets, which can't be done in-browser.

Automerge itself works fine in the browser, and given a socket connection to a server or other clients, can be wired up fairly easily with Automerge.Connection. But, it's definitely more work to integrate than hypermerge.

But, FWIW, the new version is published: hypermerge@0.4.0.