automerge / automerge-classic

A JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.
http://automerge.org/
MIT License

Channel concept, granularity of documents and dependency tracking #31

Open ept opened 7 years ago

ept commented 7 years ago

We have previously discussed a few things we'd like to be able to do with Automerge documents:

I had a call with @pvh to discuss how best to implement these concepts in Automerge/MPL. The following are some notes on what we discussed.

As a first step, hash-chaining to encode the dependency graph (#28, like parent commit hashes in git) seems like a good idea. To find all changes that have gone into a document, start with one or more HEAD commit hashes and traverse the dependency graph. When two communicating nodes have made concurrent changes, they won't know about each other's heads, so they'll need to run a multi-round protocol to figure out their latest common ancestor hash, much like git (#27).
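To make the traversal concrete, here is a minimal sketch of hash-chained changes and of collecting a document's history from its heads. The names `Change` and `collect_changes`, and the choice of SHA-256 over a JSON encoding, are assumptions for illustration only; they are not Automerge's actual format or API.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical change record; `payload` stands in for the real operations."""
    deps: tuple   # hashes of the changes this one depends on (its parents)
    payload: str

    @property
    def hash(self) -> str:
        # Assumed encoding: SHA-256 over a canonical JSON form of deps + payload.
        body = json.dumps(
            {"deps": sorted(self.deps), "payload": self.payload},
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()


def collect_changes(store, heads):
    """Return the hashes of every change reachable from the given heads."""
    seen = set()
    stack = list(heads)
    while stack:
        h = stack.pop()
        if h in seen:
            continue
        seen.add(h)
        stack.extend(store[h].deps)  # follow the dependency edges backwards
    return seen


# A root change followed by two concurrent changes made on different nodes.
root = Change(deps=(), payload="create document")
alice = Change(deps=(root.hash,), payload="alice edits title")
bob = Change(deps=(root.hash,), payload="bob adds item")
store = {c.hash: c for c in (root, alice, bob)}

alice_view = collect_changes(store, [alice.hash])
bob_view = collect_changes(store, [bob.hash])
common = alice_view & bob_view  # shared history of the two heads
```

Here both histories live in one `store`, so the shared history is a plain set intersection. Between two real nodes neither side holds the other's graph, which is why a multi-round protocol is needed to discover the common ancestor.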

This leaves the question of how you find out about changes to a document. Our proposal is to separate it into two concepts:

  1. A channel is a network abstraction for pub/sub. A change is published to a channel, and a node can subscribe to any number of channels. A channel has a unique identifier, e.g. a UUID. Channels are probably not visible to the end user, but only an internal abstraction.
  2. A document is a set of channels, and it incorporates all changes that appear in any of its channels. A document may exist only on one node, and is not necessarily shared with other nodes. To share a document, its set of channels should perhaps be written to a filesystem CRDT?
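The two concepts can be sketched with an in-memory stand-in for the network. Everything below (`Network`, `Document`, `new_channel_id`, and the callback shape) is a hypothetical illustration of the proposal, not an existing Automerge or MPL interface.

```python
import uuid
from collections import defaultdict


class Network:
    """Hypothetical in-memory stand-in for a pub/sub transport."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel id -> list of callbacks

    def subscribe(self, channel_id, callback):
        self._subscribers[channel_id].append(callback)

    def publish(self, channel_id, change):
        # Deliver the change to every current subscriber of this channel.
        for callback in list(self._subscribers[channel_id]):
            callback(channel_id, change)


class Document:
    """A document is a set of channels; it takes in changes from all of them."""

    def __init__(self, network):
        self._network = network
        self.channels = set()
        self.changes = []

    def add_channel(self, channel_id):
        if channel_id not in self.channels:
            self.channels.add(channel_id)
            self._network.subscribe(channel_id, self._on_change)

    def _on_change(self, channel_id, change):
        self.changes.append(change)


def new_channel_id():
    # The issue suggests a unique identifier such as a UUID.
    return str(uuid.uuid4())


network = Network()
template = new_channel_id()
mine = new_channel_id()

doc = Document(network)          # incorporates two channels
doc.add_channel(template)
doc.add_channel(mine)

other = Document(network)        # shares only the template channel
other.add_channel(template)

network.publish(template, "template: add heading")
network.publish(mine, "mine: fill in heading")
```

After the two `publish` calls, `doc` has seen both changes while `other` has seen only the one on the shared channel. This sketch delivers only changes published after subscription; fetching a channel's backlog would need the dependency-graph traversal discussed earlier.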

The features outlined above can all be implemented using those two concepts:

ept commented 7 years ago

Addendum: Once we have security features (encryption, authentication, access control), I imagine that channels would also be the unit at which permissions would be handled. For example, a user may have read/write access to the channels for their own documents, but read-only access to the channel for a template document. A user can create new channels just for themselves, or choose to grant other users read/write/admin permissions on their channels. From a crypto perspective, a channel would probably be the granularity at which the group key exchange happens.
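As a rough illustration of per-channel permissions, the sketch below keeps a grant table keyed by channel and user. The `Permission` levels, their ordering (read < write < admin), and the `ChannelAcl` class are assumptions made for this example; it involves no cryptography and says nothing about how the group key exchange would work.

```python
from enum import IntEnum


class Permission(IntEnum):
    # Assumed ordering: each level implies the ones below it.
    NONE = 0
    READ = 1
    WRITE = 2
    ADMIN = 3


class ChannelAcl:
    """Hypothetical access-control table with the channel as the unit of permission."""

    def __init__(self):
        self._grants = {}  # (channel_id, user) -> Permission

    def create_channel(self, channel_id, owner):
        # The creator of a channel starts out as its admin.
        self._grants[(channel_id, owner)] = Permission.ADMIN

    def level(self, channel_id, user):
        return self._grants.get((channel_id, user), Permission.NONE)

    def grant(self, channel_id, granter, user, permission):
        # Only an admin of the channel may hand out permissions on it.
        if self.level(channel_id, granter) < Permission.ADMIN:
            raise PermissionError(f"{granter} is not an admin of {channel_id}")
        self._grants[(channel_id, user)] = permission

    def can_read(self, channel_id, user):
        return self.level(channel_id, user) >= Permission.READ

    def can_write(self, channel_id, user):
        return self.level(channel_id, user) >= Permission.WRITE


acl = ChannelAcl()
acl.create_channel("template-channel", owner="alice")
acl.grant("template-channel", granter="alice", user="bob", permission=Permission.READ)
acl.create_channel("bob-notes", owner="bob")

try:
    # bob only has read access to the template, so he cannot grant anything on it.
    acl.grant("template-channel", granter="bob", user="carol", permission=Permission.WRITE)
    bob_grant_rejected = False
except PermissionError:
    bob_grant_rejected = True
```

This mirrors the example in the comment: a user has full access to channels they created and read-only access to the channel of a template document.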

ept commented 5 years ago

Last October I wrote up some related thoughts as a separate document, following a conversation with Peter van Hardenberg and Jeff Peterson. Copying it here for future reference…

We are building applications on Automerge in which the user’s workspace consists not of one big Automerge document, but many separate, small documents (under the slogan “everything is a document”). This approach has a number of advantages:

However, we have been thinking about whether this structure is really the best one. The boundaries between documents could be shifted in two ways:

These two approaches are actually quite similar: both remove the current grouping of a bunch of objects into a document, which seems like a somewhat arbitrary structure to impose on an application’s data. There are a few reasons why we might want to get rid of the document concept as it currently exists: