automerge / automerge-classic

A JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.
http://automerge.org/
MIT License
14.75k stars 466 forks source link

Range deletions proposal #401

Open ept opened 3 years ago

ept commented 3 years ago

At the moment, every deletion of an element from a list, and every deletion of a character from a text, is a separate operation. This means that if a range of elements is deleted, and concurrently another user inserts elements into that deleted range, the insertions still take effect. That can have surprising effects (thanks to James Allen for the example):

Perhaps it would be better to change the behaviour so that if a consecutive range of characters is deleted, then any concurrent insertion within that range is swallowed, so the outcome would be "hello world" in the example above. I just tested this example with Google Docs, and it does indeed swallow the concurrent insertion. I'd like to hear from the community which behaviour seems preferable.

I think range deletions are not always appropriate: for example, if you delete the first two items from your to-do list on one device, and concurrently add a new item between the first two items on another device, then it seems reasonable that the inserted item should survive. This is because each to-do list item is an independent object that makes sense by itself, whereas a character in a text rarely makes sense without its immediate context.

Thus, I propose not changing the semantics of existing single-element deletions, and only adding a new "range deletion" operation type. That way, range deletion is a new feature, not a breaking change (i.e. loading the same document containing single-element deletions in different Automerge versions still results in the same state), and therefore we can ship it with a minor version bump after 1.0.

If we go for it, I suggest that range delete operations should be generated only when you delete all the characters or list elements in a single Automerge function call (Text.deleteAt(index, count) or Array.splice(index, count) with count > 1). If you want to keep the existing semantics of treating every list element independently, you can loop over the elements to be deleted, and call deleteAt individually for each one (or if that turns out to be really slow, we could add an optional argument to deleteAt).

Any thoughts?

pvh commented 3 years ago

Yes, this seems correct to me!

josephg commented 3 years ago

Oh I like this! I hadn't thought of that, and I think its very clever. I wonder how this would interact with git style asychronous editing - like, if I delete a couple functions, and you edit one of those functions, what should the behaviour be?

pvh commented 3 years ago

The best answer would be "whatever typechecks" but in the small, I'd say that the behaviour described above (delete the span and drop the edits) is correct. Since a CRDT preserves all the history you aren't really losing anything anyway, you can always just pop over and recover anything you wanted if you change your mind.

burdiyan commented 3 years ago

I think it's a very reasonable solution. It reminds me a bit about the problem of formatting ranges. I don't know if the same thing happens with formatting text in Automerge right now, but it seems reasonable for it to be so. For example formatting a word in bold would format the concurrent insertions in the middle of the word as well.

I like that range deletions can be a non-breaking change.