SharedTree: Merge/Rebase Design

This the continuation of an email thread (Scenarios for "sandwich" rebases) between @ruiterr, @jack-williams, @DLehenbauer, @taylorsw04, @PaulKwMicrosoft, and @yann-achard-MS. The aim of the discussion is to highlight challenges in the area of edit merging and conflict resolution as well as propose potential solutions to said challenges. Only the most recent exchanges have been migrated. Earlier content may be added as needed.

Relevant Terms

Rebasing of Dependent Edits When rebasing local edits over incoming edits (or when rebasing a branch onto a base branch), there may be several local edits to be rebased. For example, a client may have a pair of local edits U and V where V was authored in the context brought about by the application of U. If an edit A was made concurrently to U and V, and edit A was sequenced prior to U and V, then U and V need to be rebased over A. The rebasing of U is relatively simple because U and A where authored relative to the same context: we rebase U over A to produce a new edit U'. Edit V however, cannot be so simply rebased.

Sandwich Rebasing Sandwich Rebasing is an approach for dealing with the challenge of rebasing dependent edits. At its most abstract, the idea is to rebase each dependent edit over the following groups of edits:

The inverse of the prior local edits it depended on
The incoming concurrent edits
The rebased version of prior local edits it depended on

For an edit V dependent on a prior local edit U, both which are being rebased over an incoming edit A, the rebase of V would be V' = rebase(V, [U⁻¹, A, U']) where U⁻¹ is the inverse of U. The term "sandwich" stems from the symmetry between U⁻¹ and U' on either side of A.

Domino Doctrine The policy of automatically considering dependent edits from a client as conflicted if the prior edit from the same client (on which the dependent edit depends) is itself conflicted.

Prior Correspondence

From: @ruiterr Sent: March 18, 2022 Hi,

as already mentioned in a meeting last week, I think that supporting both associativity and cancellation for remove / insert combinations will require either the addition of unique IDs for removes or restricting the associativity of squashes.

I spent some more time thinking about this and created an example that illustrates the problem. I would like to hear your opinions.

Counter Example Here is the example that demonstrates the problem to get cancelation correct and still preserve the associativity:

Let’s assume we have the following changes:

A ○ B ○ C ○ C⁻¹ ○ B⁻¹ ○ A⁻¹

with:

A = [remove(“ABC”)] B = [remove(“ABC”)] C = [6, remove(“ABC”)] C⁻¹ = [insert(“ABC”)] B⁻¹ = [insert(“ABC”)] A⁻¹ =[6, insert(“ABC”)]

Depending on the order we squash the changes, we might potentially get different results. Let’s first look at the most simple case:

(A ○ B ○ C) ○ (C⁻¹ ○ B⁻¹ ○ A⁻¹) = [remove(“ABCABCABC”)] ○ [insert(“ABCABCABC”)] = []

in this case, both sides remove/insert exactly the same string at the same position and thus the algorithm could easily detect the cancelation.

However, if we now apply the changes in a different order, the situation is more difficult:

((((A ○ B ○ C) ○ C⁻¹) ○ B⁻¹) ○ A⁻¹) = ((([remove(“ABCABCABC”)] ○ ([insert(“ABC”)]) ○ [insert(“ABC”)]) ○ [6, insert(“ABC”)]))

For the first squash above, we now have the removal of the string “ABCABCABC” squashed with an insert of “ABC”, both at at 0. So there are three different solutions how this squash could be resolved: [3, remove(“ABCABC”)] , [remove(“ABC”), 3, remove(“ABC”)], [remove(“ABCABC”)]. If we choose the correct one, where the string cancels out in the middle, the next two inserts will also match and we get the correct result. On the other hand, if we choose one of the other two options the latter squashes will not result in perfect cancelation.

Let’s look at the first case. Here we get as the next operation:

([3, remove(“ABCABC”)] ○ [insert(“ABC”)]) ○ [6, insert(“ABC”)]))= [insert(“ABC”), 3, remove(“ABCABC”)] ○ [6, insert(“ABC”)] = [insert(“ABC”), 3, remove(“ABC”)]

Since the insert in the second step refers to a position before the skip, it cannot correctly cancel any longer and remains in the changes. The resulting changes still describes the correct data modification (if you apply it to “ABCABCABC” it will still give “ABCABCABC”), but it is no longer an empty ChangeSet.

This proves that associativity cannot be maintained without adding additional information. The squash had three different possible results. If you now swap the order of the operations A,B, C, each time a different one would be the correct one, but the input to the algorithm would always be the same (remove(“ABCABCABC”) and insert(“ABC”)). So from this information alone, it is not possible to choose the correct resolution.

Another potential approach would be to retroactively fix the situation in the second squash (the one with skip 3 and the insert before the skip). However, in this case, the fact that we can swap the insert behind the skip relies on the fact, that the skip 3 refers to the string “ABC”. Only in that case, it is valid to swap the skip and the insert. This information is no longer present in the second squash and thus we cannot fix it here, unless we preserve the information what we skipped over.

So as far as I can see, there are now two different options: • We compromise with regard to the associativity. Concretely, this would mean that we always apply the operation and its inverse together (we would have some type of sandwich operation that takes three operations and computes A ○ B A⁻¹ or A⁻¹ ○ B ○ A and makes sure the correct squashing happens. I think this is very similar to what Yann proposed in the last meeting, where the operation additionally also included the rebase. It is also very similar to what we currently do in PropertyDDS, where we also always apply a change and its inverse and store some additional information in a runtime data structure to make sure cancelation works. • We put additional information into the ChangeSets to enable the squash algorithm to find the correct squash resolutions. There might be multiple options which information we could use for this. One possible approach would be, to put in a unique identifier that identifies the remove and refer to this identifier in the subsequent insert. With this information the squash could choose the correct resolution and thus the full cancelation would work out.As mentioned above, inserting the data we are skipping over, might be a potential alternative. Do you have any other suggestions what information would be a good choice?

We now should look at the different tradeoffs we are making, if we choose between the two options.

Compromising on Associativity

If we compromise on associativity, that will mean, that we either have to compute squashes in a very specific order or might get non-optimal squash ChangeSets. This requirement to perform the squash in a specific order is especially a problem, if you would like to build a hierarchical acceleration structure that allows you to compute ChangeSets between arbitrary states in the graph. Such a data structure relies on the fact that you can pre-compute short-cuts through the graph of ChangeSets. Without the ability to precompute those, you are forced to linearly walk along the chains to find matching change pairs and apply them together (or at least we would not be able to skip over any changes that are inverted later and thus the length of the hops would be shortened accordingly).

So which use cases do we have, where we need to compute squashed changes? At the moment, I see the following scenarios, but there might be others:

To compute the base change for rebases in interactive collaboration sessions
To compute the modifications to report to the UI
To compute the changes for the rebase in a merge
To compute the merge links in the graph
Catch up after offline periods / connection loss
History queries by users

For 1,3 and 4, we need perfect cancelation to get the correct merge results. For 2 and 5, it would be nice for efficiency reasons, but not quite mandatory. I am not entirely sure, what user requirements for history queries in 6 would be, but it would certainly be better if we could provide accurate changes.

In the cases 1 and 2 above, we are computing the ChangeSets locally anyways and here it would be no big problem to apply them in the correct order / use a special sandwiching operation. For merges (scenarios 3 and 4) , we might potentially need to compute changes over larger parts of the graph and here it might be advantageous to use a hierarchical acceleration structure to make the computation of those cheaper. If we insert the correct merge links and merge from the lowest common ancestor, pairs of inverted operations might typically not appear on the two sequences that are compared, but there certainly could be such pairs, if users explicitly inverted operations during undos.

For 5 and especially 6, the ability to precompute the ChangeSets in a hierarchical structure is essential (especially to support partial history queries, where only the changes to a subtree are requested). To support correct cancelation for those queries, we would need to preserve the associativity. For the others, we might potentially compromise with regard to associativity.

Adding additional information into the ChangeSets

As mentioned above, we would have to add information that enables us to identify pairs of remove and insert operations. I think for the removes themselves, we potentially wouldn’t need to add any additional information. If we have unique ID for the operation / change that would suffice (this could be sequence ID plus branch ID but that might make things a bit more difficult because those are only assigned once the change is sequenced. Just assigning a unique ID on submission might be simpler). The remove operation within the changes could potentially be referenced by its position in the ChangeSet (counting from the beginning or via the offset at which it is applied).

For each insert, we would have to refer to the corresponding remove operation. Again this could be optimized by referring to the global ChangeSet ID once in the header of the ChangeSet and only having a local identifier within the insert operation. This information also would only be added for ChangeSets that resulted from an inversion operation.

So the actual operations, the information that we have to add would not be that big and shouldn’t pose a big hurdle. However, when we add this information, we will have to preserve it during squashes and this has unfortunately more consequences. Especially, it means that we won’t be able to join adjacent remove / insert operations, since we have to preserve the IDs on the individual operations.

Fortunately, this can’t cause an unbounded growth of the ChangeSet, but worst case, we still could have to insert an ID for every single removed character in a long sequence.

Overall, I think this would be somewhat similar to the approach taken by MergeTree, which (if I understand the algorithm correctly) also assigns unique IDs to sequences to later be able to match corresponding operations.

There might be one optimisation to avoid this. We could say, we only add the unique IDs if there is actually a duplicated sequence within the remove. If there is none, we would be able to find the correct position for cancelation by searching through the remove sequence to find the correct insert position which would result in cancelation. However, this would only work, if we assume that the inserted sequence hasn’t been modified afterwards. In such a case, we would want the edit to remain, but the insert and remove to cancel out. We could not detect this situation, because we could no longer match it accurately to the initial changes (without an n² minimum edit computation, which also doesn’t guarantee a unique solution). To solve this, we would need to keep the inserted value and the modified value distinct in the squashed ChangeSet. This would be similar in price to what we already do for the modify alone (if it has not been squashed with an insert), since we need to preserve the old value to keep the ChangeSet reversible.

An alternative to IDs on the remove sequences, would be unique IDs on each individual element. Since we intend to support unique IDs on elements anyways, this might be an alternative way to resolve the ambiguities. However, I think in many cases, those IDs are only available via a lookup in an additional data-structure, which we would like to avoid within the squash itself and I assume that adding them to every ChangeSet would be more expensive than the remove sequence IDs described above.

Conclusions As far as I can see, the decision will mainly depend on whether we would like to have correct cancelation in history queries. The other use-cases could potentially also be solved by compromising on associativity (however, also with an impact on performance and complexity). I currently lean towards adding the necessary information to ChangeSets to enable the correct computation of squashes, since this at least doesn’t seem to impact the asymptotic complexity and might be optimized in many cases, as long as there are not many repeating sequences within a single remove.

What are your thoughts on this? Do you see alternatives to the IDs on removes I suggested?

From @yann-achard-MS Sent: 18. March 2022

Hi Roland,

Thank you for taking the time to write all this down.

About Cancellation of Edits

One thing I want to clarify and establish some terminology for is the situation in which we expect some change and its inverse to cancel out. In the original HFDM approach there are two places where that can occur, which I’m going to nickname “effective” and “silent” cancellation.

Let’s consider a scenario where a local branch contains edits U and V while the main branch contains a new edit A that U and V are effectively concurrent to. In such a scenario, we can think of the local client’s history as looking like this:

U ○ V ○ [V-1 ○ U-1 ○ A’ ○ U’ ○ V’]

Where the portion in […] represents the UI update (and what would be used to rebase a theoretical change W into W’). The local client’s history may not actually look this, but conceptually this is what the UI experiences.

Effective Cancellation

This happens when the left side of the sandwhich is expected to cancel out with the matching original local edit that comes before it. For example if V inserts some contents but V’ doesn’t insert that content anymore, then we expect that V and V-1 will cancel out such that the net impact of the UI update is to delete that content inserted by V. One might argue that V and V-1 don’t really cancel out because V would already have been applied so we’re not squashing it with V-1, but the effect on the data and the effect on later changes that would be rebased over V then V-1 do effectively cancel out.

Silent Cancellation

This happens when the left side of the sandwhich is expected to cancel out with the matching right side of the sandwhich. For example if V inserts some contents and V’ also inserts it (in the same place), then we expect that V-1 and V’ will cancel out such that the net impact of the UI update is nil (at least as far as V is concerned).

If I understood correctly, in HFDM, silent cancellation is what is relied on to produce the adequate UI update when a local change is unaffected by the changes on the main branch. By contrast, in the approach I’m proposing, silent cancellation never happens. This is because we only produce a sandwhich-y looking UI update when the two slices of bread do not cancel out. So in the example where V is unaffected by A (and unaffected by the effect of A on U) the conceptual history would look either like this:

U ○ V ○ [U-1 ○ A’ ○ U’]

Or, if U is also unaffected by A, then it would look like this:

U ○ V ○ [A’]

Let me know if you see detect some issues with this. Also feel free to suggest other names than “effective” and “silent”, I just needed to pick something.

About Squashing

One thing other thing I should mention: I’m currently picturing commits as composed of frames which do not get squashed together. I use the term “frame” (I’m open to other names) to describe a fully squashed changeset or a fully squashed constraint set. A single commit may contain several such frames of either or both types. There are situations (such as computing a “catch-up” changeset which will never be rebased or rebased over) where the change frames can be squashed together (and the constraint frames discarded), but generally speaking a complex rebase ends up concatenating frames instead of squashing them together.

It may be possible to do more squashing even in cases where all information needs to be preserved if we’re willing to make the squashing and rebasing more complex. I’m not looking into it just yet as getting things to work at all is proving to be quite the challenge.

This sort of make the associativity question a little moot (for now) in the sense that there is no squashing happening.

Further comments:

A = [remove(“ABC”)] B = [remove(“ABC”)] C = [6, remove(“ABC”)] C⁻¹ = [insert(“ABC”)] B⁻¹ = [insert(“ABC”)] A⁻¹ =[6, insert(“ABC”)]

I’m confused by A, B, and C: are they meant to be sequential edits (meaning that each edit was performed with knowledge of the previous one)? If so, then shouldn’t (A ○ B ○ C) be [remove(“ABCABC”), 6, remove(“ABC”)]? And if not, then aren’t edits A and B deleting the same contents (which is a conflict in PropertyDDS but would make (A ○ B ○ C) equivalent to [remove(“ABC”), 6, remove(“ABC”)] in SharedTree)?

For the first squash above, we now have the removal of the string “ABCABCABC” squashed with an insert of “ABC”, both at at 0. So there are three different solutions how this squash could be resolved: [3, remove(“ABCABC”)] , [remove(“ABC”), 3, remove(“ABC”)], [remove(“ABCABC”)]. If we choose the correct one, where the string cancels out in the middle, the next two inserts will also match and we get the correct result. On the other hand, if we choose one of the other two options the latter squashes will not result in perfect cancelation.

My current approach has been to include enough information to ensure that the correct option is always chosen. A lot of that information is needed for correct merge outcomes in the first place.

Another potential approach would be to retroactively fix the situation in the second squash (the one with skip 3 and the insert before the skip). However, in this case, the fact that we can swap the insert behind the skip relies on the fact, that the skip 3 refers to the string “ABC”. Only in that case, it is valid to swap the skip and the insert. This information is no longer present in the second squash and thus we cannot fix it here, unless we preserve the information what we skipped over.

While it works out here, relying on the fact that the skip 3 revers to the string “ABC” is generally speaking not safe from an intention-preservation standpoint. There are two reasons for this:

Nodes are not fungible. If someone is trying to insert content after some node that represents the letter “B” then we need to make sure we don’t end up inserting next to some other node that happens to also represent the letter “B”.
When looking at the bread slices of a sandwich, the overlap of a change segment with its inverse does not guarantee that the two should cancel out. This is a consequence of moves. Here is a simplified scenario (to the point that it’s a little ridiculous) where this comes up: a. In a trait [A B] where user 1 and 2 are making edits concurrently (with edit 1 being sequenced first): b. Edit 1 (by user 1): move (empty) slice A[…]B before B c. Edit 2 (by user 2): insert X Z after A (with a commutative anchor) => should lead to state AxzXZB (using lowercase letters for tombstones) as opposed to state AXZB (which would be what you’d get if you cancelled-out xz and XZ). d. Edit 3 (by user 2): insert Y after X (with no-commutative anchor) => should lead to state AxYzXZB (which is only possible if xz and XZ are not cancelled out together).

I’m not sure if you were aware of all that and were just using “skip 3 refers to the string “ABC”” as a shorter description. Either way it’s good to call these challenges out.

So as far as I can see, there are now two different options: • We compromise with regard to the associativity. Concretely, this would mean that we always apply the operation and its inverse together (we would have some type of sandwich operation that takes three operations and computes A ○ B A⁻¹ or A⁻¹ ○ B ○ A and makes sure the correct squashing happens. I think this is very similar to what Yann proposed in the last meeting, where the operation additionally also included the rebase. It is also very similar to what we currently do in PropertyDDS, where we also always apply a change and its inverse and store some additional information in a runtime data structure to make sure cancelation works. • We put additional information into the ChangeSets to enable the squash algorithm to find the correct squash resolutions. There might be multiple options which information we could use for this. One possible approach would be, to put in a unique identifier that identifies the remove and refer to this identifier in the subsequent insert. With this information the squash could choose the correct resolution and thus the full cancelation would work out.As mentioned above, inserting the data we are skipping over, might be a potential alternative. Do you have any other suggestions what information would be a good choice?

You’re correct that in our conversations I was thinking of applying the operation and its inverse together but I didn’t mean to be prescriptive. In other words I haven’t yet given up on associativity. As mentioned higher I am currently focusing on the “add additional info” approach. Right now the additional information consist of a sequence (i.e., commit) number (which is needed for tombstones in general) and an op ID (which I think is what you’re proposing bellow). Going forward I expect we’ll need something more complex to deal with the fact that an edit may experience multiple merges (e.g., merge branch C into branch B, then merge branch B into branch A).

Overall, I think this would be somewhat similar to the approach taken by MergeTree, which (if I understand the algorithm correctly) also assigns unique IDs to sequences to later be able to match corresponding operations.

This is also how I think of it.

There might be one optimization to avoid this. We could say, we only add the unique IDs if there is actually a duplicated sequence within the remove. If there is none, we would be able to find the correct position for cancelation by searching through the remove sequence to find the correct insert position which would result in cancelation. However, this would only work, if we assume that the inserted sequence hasn’t been modified afterwards. In such a case, we would want the edit to remain, but the insert and remove to cancel out. We could not detect this situation, because we could no longer match it accurately to the initial changes (without an n² minimum edit computation, which also doesn’t guarantee a unique solution). To solve this, we would need to keep the inserted value and the modified value distinct in the squashed ChangeSet. This would be similar in price to what we already do for the modify alone (if it has not been squashed with an insert), since we need to preserve the old value to keep the ChangeSet reversible.

This sounds interesting but I’m having a hard time following. A concrete example would probably help. Is this predicated on looking at the contents being inserted and deleted? How would that work for move operations? I guess we could make the case that this optimization would only work for insert/remove which is still better than nothing.

What are your thoughts on this? Do you see alternatives to the IDs on removes I suggested?

We have yet to fully harmonize on “frames” vs. squash stuff, on my confusion around the examples, and there’s the effective vs silent cancellations, but none of that fundamentally changes any of the above, so I think we’re on the same page.

From: @ruiterr Sent: March 23, 2022

Hi Yann,

I was only referring to the case you called “silent cancelation”. With cancelation I meant that two edits in two ChangeSets that are squashed are removed, resulting in an empty change / a skip. So in the case you called “Effective cancelation”, there wouldn’t be any cancelation according to my definition (since the inverse change is removed in the rebased ChangeSet and thus the operations from the inverse change just remains in the ChangeSet).

As you point out, HFDM relied on the cancelation to produce efficient UI updates that did not cause redundant remove/insert operations. The cancelation was also needed to get correct rebase results. E.g. without cancelation rebasing a modify operation M with respect to an insert I would not have worked. The I-1 would contain an remove and rebasing M with respect to I-1 would have removed the modify operation, so that applying M’ after I’ would no longer perform the modification.

I agree that these two cases could be solved by having a specialized sandwich rebase+apply operation (how did you call this postbase?), that internally detects the cases where cancelation (according to my definition, “silent cancelation” to yours) would occur and wouldn’t include any change in the resulting ChangeSet.

The argument I made below, was not only about the sandwich rebase case, but considering the more abstract question whether we could achieve the group axioms for squashes and inversion and whether it would be worth paying the price to get cancelation. Apart from the sandwich rebase case, there might be other cases where we would like to get the correct result with cancelation (especially history queries, but also the computation of ChangeSets for merges and merge links). HFDM and also the PropertyDDS currently do not have this capability. The information to enable cancelation is only preserved during the rebase and lost afterwards (which is equivalent to the rebase operation you are proposing).

So the question would be, in which other cases (apart from the local sandwich rebase) we would encounter changes in the history that we would like to cancel out and are those important enough to warrant the inclusion of the extra information needed for cancelation?

Undo The most obvious case is a situation where a user explicity applied an undo operation and submitts an inverse change (within a single linear history, no rebases at all). In such a case, it would be desirable to have the undo operation cancel out. For example, to have another change that is rebased with respect to this operation be correctly applied, but also to show a correct diff and to avoid that clients catching up have to do unnecessary UI updates.
Changes between states in parallel branches If we compute the changes between two states in parallel branches, we walk backwards in one branch (resulting in inverse changesets) and forward in the other branch. So if a (unchanged or rebased) operation occurs in both branches, we would have cancelation. I think, if the only operation we allow in the graph are merges and we always compute differences across the shortest path with respect to the merge links, we should not get into a situation where an operation occurs on both parallel branches. However, if we allow rebase operations or cherry-pick operations we could produce situations where such a query encounter cancelation. It would also happen, if we compute the changes along the merge link itself, but I assume we could solve this again via the sandwich / postbase operation. I not sure, whether there are situations with multiple merges, where this is no longer well defined (there are cases where the LCA is not unique and I need to think in more detail about those to make a definitive statement). Queries for changes between parallel branches could be needed in several use cases, e.g.: computing merge links, showing a diff between branches, switching branches efficiently.

Do you see any other cases where we would require cancelation?

So I think the question, whether we should us a sandwich/postbase operation for rebasing local changes and whether we put the information needed for cancelation into the ChangeSets is orthogonal from each other. As you pointed out, a postbase operation might be more efficient, because it avoids the need to first detect the cancelation in the squash and then remove the parts that have canceled out again. However, we might still want to put the additional information into the ChangeSets, so that we can later obtain the same result again from the history. Additionally, I would expect this operation to be consistent with the result of explicitly computing the sandwich.

With regard to squashing within commits, I think it is important to distinguish the two cases where an operation serves as the target or the base of a rebase operation (I think we once used the terms A and B changes and it relates to the left/right distributivity I mentioned below). In addition to that we also have cases where the pure data changes are requested (history queries, catch-up, UI updates).

Squashing the Base / Right distributivity I personally think it is quite imporant to be able to use a squashed changeset as the base changeset for a rebase operation (the right operand of the ↷ operation). Without this, a rebase/merge of a branch becomes a O(n*m) operation (where n is the number of changes on the rebased branch and m is the number of changes on the target branch). Even though this argument applies to both sides, I think in many cases there will be an assymetry between n and m. Especially on documents with high activity, there might be many cases where an old branch with a few changes has to be merged into the mainline, where a large number of changes has happened in the meantime. I also think that from a users perception, it is easier to understand that merging a very large branch takes long, wheras users might not anticipate merging to be very slow, when there are many changes on the target branch. Especially the worst case scenarios, where a branch is merged after months/years could be very bad. This situation becomes even worse, when there are merge links in the graph. Without the ability to squash the changes across such a link and have invers changes cancel out, each of those links would contain all changes along the merged loop and those would accumulate when merging multiple nested branches.

Another consideration is the ability to combine partial checkouts with rebases / merges. If a squashed and filtered ChangeSet can serve as base ChangeSet for a merge/rebase, merging a change even after years is potentially a quite cheap operation. We would do a query to the server to only get the changes for the actually affected sub-trees. This could be done in logarithmic time with regard to the length of the history and linear time with regard to the size s of the change. Rebasing of a sinlge change would again only be linear in the size of the change and thus merging a single change would be complexity O(log(m) * s).

Squashing the changes to be rebased / Left distributivity In this case, the ability to squash the changes is less important. It would still be nice to have the ability to squash all basic data manipulations and be sure that rebasing them together gives the same result as rebasing them separately. On the one hand, this is a good way to be sure that the rebasing algorithm works consistently and correctly. It also would have performance advantages, if it would be possible to rebase a long branch together, instead of rebasing each change separately, but as soon as there are constraints / commands that require reapplication this would not be possible any longer (an interesting question here is, whether a conflict handler that is built based on deltas, instead of reexecution, would also work correctly, even when rebasing/merging a large number of changes together. This would not be true for every delta based conflict handler, but for those that could guarantee this property, there would be the possibility to do a merge based on two squashed changesets in a more efficient way).

Could you describe in more detail, what the frames you were referring to are needed for and to which of the scenarios above they would apply? Would they apply only for the latter of the two cases or for both? Would those be needed in addition to the temporal history, we already are planning to store in a change anyways?

I’m confused by A, B, and C: are they meant to be sequential edits (meaning that each edit was performed with knowledge of the previous one)? If so, then shouldn’t (A ○ B ○ C) be [remove(“ABCABC”), 6, remove(“ABC”)]? And if not, then aren’t edits A and B deleting the same contents (which is a conflict in PropertyDDS but would make (A ○ B ○ C) equivalent to [remove(“ABC”), 6, remove(“ABC”)] in SharedTree)?

They were supposed to be sequential edits (happening with knowledge of each other). I think I got the skips wrong, sorry. The idea was, that I have a string “ABCABCABC” and the first edit deletes the first three letters, the second edit the last three and then last one the middle three. So I guess it should have been:

A = [remove(“ABC”)] B = [3, remove(“ABC”)] C = [remove(“ABC”)] C⁻¹ = [insert(“ABC”)] B⁻¹ = [3, insert(“ABC”)] A⁻¹ =[insert(“ABC”)]

I think now we would get the correct results:

(A ○ B ○ C) ○ (C⁻¹ ○ B⁻¹ ○ A⁻¹) = [remove(“ABCABCABC”)] ○ [insert(“ABCABCABC”)] = []

((((A ○ B ○ C) ○ C⁻¹) ○ B⁻¹) ○ A⁻¹) = ((([remove(“ABCABCABC”)] ○ ([insert(“ABC”)]) ○ [3, insert(“ABC”)]) ○ [insert(“ABC”)]))

And so the resulting operations would be: ([3, remove(“ABCABC”)] ○ [3, insert(“ABC”)]) ○ [insert(“ABC”)])) = [6, remove(“ABC”)] ○ [insert(“ABC”)] = [insert(“ABC”), 6, remove(“ABC”)]

Did I get it right this time?

A lot of that information is needed for correct merge outcomes in the first place.

Where else is it needed?

Nodes are not fungible. If someone is trying to insert content after some node that represents the letter “B” then we need to make sure we don’t end up inserting next to some other node that happens to also represent the letter “B”.

Yes, that is also the reason why we have the ambiguity in my counter example and why it is not sufficient to choose an arbitrary solution for the cancelation. I think you are right that inserting the skipped strings alone wouldn’t be sufficient to resolve the ambiguity. You probably could build a greedily cancels edits and therefore would give the full cancelation in the example above, but it could result in situations where actually more cancelation happens than intended by the initial edits.

I am not sure, whether a strategy than the optimization I considered further below would be possible, where additional information is only included if there is a possibility of ambiguity.

Generally, including the skips was not an option I really considered in much detail so far, I mostly included it for completeness sake.

When looking at the bread slices of a sandwhich, the overlap of a change segment with its inverse does not guarantee that the two should cancel out. This is a consequence of moves. Here is a simplified scenario (to the point that it’s a little ridiculous) where this comes up

I didn’t yet go through all the scenarios for moves. If I understand your example correctly, your main point is, that you have operations which follow moves and other which don’t and then even a move that does not actually change anything would have an effect? This is not one of the cases where I would require cancelation (according to my axioms), because we never did the inverse of the move operation. A question would be, how the system is supposed to behave, if we use your example and a user (or the rebase algorithm) explicitly creates the inverse operation of the slice move. In that case, the axioms require that it would cancel out. What would be the rebase behavior you expect in that case for a commutable vs non-commutable insert?

You’re correct that in our conversations I was thinking of applying the operation and its inverse together but I didn’t mean to be prescriptive. In other words I haven’t yet given up on associativity. As mentioned higher I am currently focusing on the “add additional info” approach. Right now the additional information consist of a sequence (i.e., commit) number (which is needed for tombstones in general) and an op ID (which I think is what you’re proposing bellow). Going forward I expect we’ll need something more complex to deal with the fact that an edit may experience multiple merges (e.g., merge branch C into branch B, then merge branch B into branch A).

Yes, that sounds very similar to my suggestion. Do you include this only for removes or for all operations?

This sounds interesting but I’m having a hard time following. A concrete example would probably help. Is this predicated on looking at the contents being inserted and deleted? How would that work for move operations? I guess we could make the case that this optimization would only work for insert/remove which is still better than nothing.

The idea here is, that if there is no ambiguity between a remove and an insert, we could potentially avoid inserting the operation IDs. Let’s assume we have a remove of a single “ABC”: [remove(“ABC”)] and no other removes at the same position. In that case, there are no possible ambiguities with regard to removes (we would have to check that there are no repeating sub sequences when we put the remove operation without an operation ID into the squashed ChangeSet). If we now get an insert(ABC), which specifies a previous operationID, we know that there must have been a remove of this string at that position and it must be the only possible position we find (assuming the operations and their reverses are applied in the correct order and directly adjacent to each other, if this is not the case the insert would have to be rebased with respect to intermediate changes and this might potentially require the addition of further scaffolding to resolve ambiguities). I am not sure, whether all this effort is warranted, just to avoid operationIDs and we also would need to analyse in more detail whether it would capture all possible cases.

So in the case you called “Effective cancelation”, there wouldn’t be any cancelation according to my definition (since the inverse change is removed in the rebased ChangeSet and thus the operations from the inverse change just remains in the ChangeSet).

Let me see if I can restate that: In a scenario where a local branch [U, V] is being merged onto a main branch [A], the change V' will not contain any inverse changes that would cancel with V (since V was never applied to the main branch in the first place).

That makes sense to me, but what of the need to produce the appropriate (minimal) delta for the local UI? That delta may need to undo the changes made by U and/or V. Do you not think of this as cancellation as well?

I agree that these two cases could be solved by having a specialized sandwich rebase+apply operation (how did you call this postbase?), that internally detects the cases where cancelation (according to my definition, “silent cancelation” to yours) would occur and wouldn’t include any change in the resulting ChangeSet.

It does sound like you're describing what I've been calling "postbase" but I'm surprised you're referring to is as "rebase+apply". I think of postbase as being analogous to rebase with the key difference that it allows the application of edits to occur in the opposite order: If A and U are concurrent edits, and A is deemed to be applied first, then rebasing U over A will produce U' to be applied after A, while postbasing A over U will produce A' to be applied after U. In both cases the outcomes must be equivalent.

Do you see any other cases where we would require cancelation?

Not at the moment. Thank you for pointing those out.

So I think the question, whether we should us a sandwich/postbase operation for rebasing local changes and whether we put the information needed for cancelation into the ChangeSets is orthogonal from each other. As you pointed out, a postbase operation might be more efficient, because it avoids the need to first detect the cancelation in the squash and then remove the parts that have canceled out again. However, we might still want to put the additional information into the ChangeSets, so that we can later obtain the same result again from the history. Additionally, I would expect this operation to be consistent with the result of explicitly computing the sandwich.

Checking my understanding: would it then be correct to say that the advantages offered by postbase (as opposed to the explicit sandwich) are purely an optimization for scenarios where the merge logic is aware of the possibility of changes cancelling out with their inverse, and that we need to support such a cancellation in the general case (i.e., when the merge logic is not aware of such things) in general?

With regard to squashing within commits, I think it is important to distinguish the two cases where an operation serves as the target or the base of a rebase operation.

Agreed. I've been picturing this as having two squash functions: one for squashing changes in a single change frame (or constraints in a single constraint frame), and one for squashing frames together. The former can be used all the time, while the latter can only be used if we don't need the changes to be rebased or be rebased over.

Squashing the Base / Right distributivity I personally think it is quite important to be able to use a squashed changeset as the base changeset for a rebase operation (the right operand of the ↷ operation).

I agree it would be great to be able to squash those changes before rebasing over them. One challenge to doing so is that rebased edits need to be able to include tombstones that specify precisely which of the base edits the content was removed by. So, if I'm rebasing edit U over edits [A, B, C], then U' needs to be able to say, for each of its tombstones, which of A, B, or C, deleted the content that the given tombstone represents. This is needed because some other edit V may be rebased over U' in which case we will need to be able to compare the tombstones in V and in U'.

One approach we could look into is to compute a list of the base changes which (for the sake of intention-preservation) cannot be squashed prior to the rebase. This would let us squash all the other changes before, after, and between such changes. So for example if the base changes were [A, B, C, D, E, F, G, H, I] C and G needed to be preserved as being independent, then we'd end up rebasing over [A○B, C, D○E○F, G, H○I]. The hope is that in practice we'd end up squashing most changes.

Another approach would be to preserve enough provenance information in the squash of [A, B, C] to ensure the rebase algorithm would output the same U' as it would have if rebased over A, then B, then C. In practice I think this means each removal segment in the squash of [A, B, C] would need to be tagged with an identifier that uniquely identifies the original edit (i.e., either A, B, or C) it was made in. It would also mean that contiguous deletions that originated from different edits could not be merged together (which makes squashing slightly less advantageous).

Is the latter approach what you had in mind?

Squashing the changes to be rebased / Left distributivity [...] as soon as there are constraints / commands that require reapplication this would not be possible any longer

Agreed. Note that even in situations where no explicit constraint is expressed, there is still the possibility of a conflict due to implicit constraints (e.g., "insert in trait foo of node X" will conflict with concurrent edits that either directly or indirectly delete node X"). For this reason, we typically cannot squash across transaction boundaries.

an interesting question here is, whether a conflict handler that is built based on deltas, instead of reexecution, would also work correctly, even when rebasing/merging a large number of changes together.

I think that would only be possible in a system that did not offer atomicity guarantees for its transactions (at which point the term "transaction" would be misleading). Perhaps you're thinking of some specific scenarios where transactions are never made to fail.

Could you describe in more detail, what the frames you were referring to are needed for

Quick disclaimer: the constraint design has not been explored enough yet so anything I say about it may need to be revisited once we have a more concrete picture of it.

The frames are needed to segregate changes from constraints when edits are initially produced by a client. The reason we need to segregate changes from constraints is that constraints are expressed relative to the state that the document is in at the time. This means a constraint may need to be evaluated before any changes made by the transaction, after all the changes made by the transaction, or somewhere in between. For example, a transaction that makes two changes may include some constraints on the state before the first change, between the two changes, and after the second change. This presents two challenges:

We need to be able to describe, for a given constraint, when it should apply relative to the changes in the transaction.
We need to be able to represent the constraint for the state/time for which it applies to.

The frames give us a simple and easy solution by introducing a temporal ordering between changes and constraints. That said, if we could find a way to convert the constraints that apply to the state after some (or all) of the transaction changes have been applied, into equivalent constraints that can be applied at the start of the transaction instead, then we'd be able to get rid of frames and we'd be able to squash all the changes within a transaction together. We'd essentially be left with two frames for each transaction: first a potentially empty constraint frame, followed by a change frame. Whether such a conversion is possible and practical is not something I've thought much about, but one thing is sure: our desire to move away from constraints about the state of the document (and towards constraints about concurrent changes), is likely to make it more feasible.

and to which of the scenarios above they would apply? Would they apply only for the latter of the two cases or for both?

The segregation and interleaving of constraint and changes frames is only needed in the latter of the two cases (changes being rebased) because that's the only time where we need to (re)evaluate constraints. That said, I think that frames may end up also being relevant in the other case because tombstones need to uniquely reference changes of prior edits and they might need to refer to frames to do so. This is an area of the design that is in flux and can be thought of as an implementation detail to cope with the existence of frames rather than a motivating factor. We may end up using a different way for tombstones to refer such prior changes (more on that below).

Would those be needed in addition to the temporal history, we already are planning to store in a change anyways?

They indeed would be because peers need to be able to assess whether a constraint might have been violated, and we want to be able to do so without having to pull data from the history server and either converting the temporal information into spatial information or reifying the tree to apply the temporal changes for that purpose.

A = [remove(“ABC”)] B = [3, remove(“ABC”)] C = [remove(“ABC”)] C⁻¹ = [insert(“ABC”)] B⁻¹ = [3, insert(“ABC”)] A⁻¹ =[insert(“ABC”)]

I think now we would get the correct results:

(A ○ B ○ C) ○ (C⁻¹ ○ B⁻¹ ○ A⁻¹) = [remove(“ABCABCABC”)] ○ [insert(“ABCABCABC”)] = []

((((A ○ B ○ C) ○ C⁻¹) ○ B⁻¹) ○ A⁻¹) = ((([remove(“ABCABCABC”)] ○ ([insert(“ABC”)]) ○ [3, insert(“ABC”)]) ○ [insert(“ABC”)]))

And so the resulting operations would be: ([3, remove(“ABCABC”)] ○ [3, insert(“ABC”)]) ○ [insert(“ABC”)])) = [6, remove(“ABC”)] ○ [insert(“ABC”)] = [insert(“ABC”), 6, remove(“ABC”)]

Did I get it right this time?

Yes, that makes sense to me. I think I also made some mistake in my comment on that portion. It's really easy to get something wrong with this stuff.

A lot of that information is needed for correct merge outcomes in the first place.

Where else is it needed?

The information I was referring to is the data that normal changes and tombstones need to contain:

Slice operations and move operations need to contain an op ID that uniquely identifies the operation within the commit.
Tombstones need to contain a commit number to identify the commit that deleted the data, and possibly a frame number.
Tombstones for slice and move operations need both.

Storing this information in the inverse edits is of course an additional cost (we're not amortizing anything here) but it's not a new kind of cost we weren't considering already.

I didn’t yet go through all the scenarios for moves. If I understand your example correctly, your main point is, that you have operations which follow moves and other which don’t and then even a move that does not actually change anything would have an effect?

The part about the move not actually changing anything is a bit of a distraction here. The main point is that the location of a change segment in the trait, given the change-representation scheme I portrayed, is not enough to judge whether a change segment ought to cancel out with its inverse. I've since tweaked my change-representation scheme a little so that such a case doesn't arise anymore (because a change and its inverse will never match in position if they shouldn't cancel out) but it's a good pitfall to keep in mind.

If [...] a user (or the rebase algorithm) explicitly creates the inverse operation of the slice move. In that case, the axioms require that it would cancel out. What would be the rebase behavior you expect in that case for a commutable vs non-commutable insert?

I would expect that a commutable insert would end up in the location where it was originally inserted. A non-commutable insert would be unaffected by the slice-move and therefore unaffected by its inverse. For example, in a trait foo containing nodes [A, B, C] imagine the following operations took place:

E1: (ref=0) slice-move [A, B, C] from trait foo to trait bar
E2: (ref=1) undo E1
E3: (ref=0) insert (commutative) W after A in trait foo
E4: (ref=0) insert (non-commutative) X after B in trait foo
E5: (ref=1) insert (commutative) Y after A in trait bar
E6: (ref=1) insert (non-commutative) Z after B in trait bar The final state should be foo=[A, W, B, X, C] and bar=[Y, Z].

Right now the additional information consist of a sequence (i.e., commit) number [...] and an op ID [...].

Do you include this only for removes or for all operations?

This is what I currently have but I'll wager it'll change before long.

Operation	Inverse Operation	Inverse Includes Seq/Frame#	Inverse Includes Op ID
SetValue	RevertValue	Yes	No
RevertValue	RevertValue	Yes	No
Insert	Delete	No	No
Delete	Revive	Yes	For slice only
Revive	Delete	No	No
[MoveOut MoveIn]	[Return MoveOut]	Yes	Yes
[Return MoveOut]	[MoveOut Return]	Yes	Yes
[MoveOut Return]	[Return MoveOut]	Yes	Yes

The "revive" and "return" segments are special versions of insert and move-in (respectively) that do not introduce new anchor points. So, if some user is inserting X between two deleted nodes A and B, the inversion of the deletion will leave X between A and B as opposed to either before or after A and B.

You may be wondering how I can get away without assigning an op ID to the delete operations. The idea here is that since I'm already unable to squash separate frames/transactions together, the revive segment can uniquely identify the delete operation that it relates to solely by referring to its Seq/Frame# and relying on positional matching. This is not something I'm particularly attached to, and I wouldn't be surprised if it stopped working or stopped being worth the effort going forward.

The idea here is, that if there is no ambiguity between a remove and an insert, we could potentially avoid inserting the operation IDs [...]

Thanks for explaining that idea further. Anything that requires doing content comparison makes me a little uneasy, but I can indeed see it working for some specific cases. I don't think it's worth the mental effort to delve deeper into it while the broader system is so much in flux but it's definitely something to revisit once things settle down as it's possible that the "specific cases" turn out to be very common.

One thing that would be helpful to me, would be a list of scenarios/capabilities that we know will solely rely on the temporal format (as opposed to the spatial one). This list may not be 100% settled as of yet but even writing down what we know for sure and what we're tentatively positing would be a start.

My approach so far has been "Make the spatial format do everything. The temporal format will just provide a better outcome in some cases." but being able to reduce that scope would be really helpful. For example, we could say "spatial changes are only for the collab window in live collab scenarios, and for computing minimal delta to catch-up clients' state". I don't expect we will say that. I'm just highlighting the sort of statement that I'm looking for.

Is there an open GH issue for when to use one format vs. the other? CC @taylorsw04 @DLehenbauer @PaulKwMicrosoft

Two more previous mails from the thread, which give a bit more background on the symbols and terminology used above:

From: @ruiterr Sent: February 21st, 2022

I think it is very useful to think about the whole problem space as an “algebraic structure”. We have been using those terms already earlyier, but in the following I would like to structure all of this more cleanly.

Generally, we have three basic operations on ChangeSets, for which I am going to use the following symbols:

Squash	A ○ B
Inverse	A⁻¹
Rebase	A ↷ B

For the first two operations, we can now define a set of Axioms, which together build the structure of a group (or more specifically a groupoid, if we assume that some squashes are not allowed, e.g. because of type mismatches):

Identity Element	A ○ ∅ = A = ∅ ○ A
Inverse Element	∀A: ∃A⁻¹: A ○ A⁻¹ = ∅ = A⁻¹ ○ A
Associativity	∀A,B,C: A ○ (B ○ C) = (A ○ B) ○ C

In addition to that, we have additional axioms for the rebase, behaviour with respect to the identy and “distributivity of the rebase” (for lack of a better term, I used this, even though it does not exactly match distributivity in a ring, so maybe we need a different term):

Rebase with identity	∀A: (A ↷ ∅) = A
Rebase of identity	∀A: (∅ ↷ A) = ∅
Left Distributivity	∀A,B,C: (A ○ B) ↷ C = (A ↷ C) ○ (B ↷ (A⁻¹ ○ C ○ (A ↷ C) ))
Right Distributivity	∀A,B,C: A ↷ (B ○ C) = (A ↷ B) ↷ C

We can derive a lot of useful properties that follow from these basic axioms:

The associativity of the squash essentially proves that the squash is correct. It implies that we can squash arbitrary subsets of the change graph and apply them together (e.g. for catch up). It also implies that we can perform the squashes in arbitrary order, which enables the hierarchical history, where we create shortcuts in the graph.
From the Left Distributivity follows, that rebasing dependent operations gives the correct result (i.e. that the sandwich rebase is correct). E.g. assume A inserts an object and B changes it. In that case (A ○ B) would directly insert the modified object. The Left Distributivity now implies that rebasing the combined operation is equivalent to separately rebasing A and B, i.e. the rebased A and B have to jointly create the same modified object as rebasing the modified object directly.
Associativity and Distributivity together also imply that the rebase and squash is correct. We can show that those give the correct result for the basic cases (e.g. a single insert or rebasing to primitive operations with each other) and from those will follow that the result is also correct for all more complex combinations of the basic operations.
Left Distributivity also implies that we can rebase each change individually or also as a group. This would help to improve the performance of large rebases. On the other hand, it means that we get the correct result, if we have to rebase individually, e.g. because we have to do conflict resolution for some of the intermediate changes.
The Right Distributivity implies that we can rebase a branch in an arbitrary order (e.g. commit by commit or also in bigger jumps). This is useful to ensure convergenve of the local branch, if we repeatedly rebase it with the remote branch. It also shows that users that repeatedfly rebase a feature branch will get the same result, independent of the exact points where they do the rebase.
Rebase of identity, the Inverse element and Left Distributivity together imply that undos are preserved under rebase: ∅ = (∅ ↷ B) = (A ○ A⁻¹) ↷ C = (A ↷ B) ○ (A⁻¹ ↷ (A⁻¹ ○ C ○ (A ↷ B))) => (A ↷ B) = (A⁻¹ ↷ (A⁻¹ ○ C ○ (A ↷ B))) ⁻¹
Rebase with identity, Inverse Element and Right distributivity imply that the “exclude operation” is well posed: A = A ↷ ∅ = A ↷ (B⁻¹ ○ B) = (A ↷ B⁻¹) ↷ B. This implies that an element that excludes the effect of B exists, and the exclude and include cancel out with each other, giving back the original change.
The Associativity of squashes and Right distributivity implies, that rebases will give the same result, if they are done with respect to different paths through the change graph (this is relevant, once we introduce merges, these will create changes with two parents and therefore, there will be multiple paths through the commit graph that connect two states).

In my mind, the “ideal” situation would be that we come up with a ChangeSet format and set of operations that preserve all these axioms. In that case, all operations we need, could be implemented in terms of those basic operations and we could be confident that they give correct results. However, it might very well be the case that we have to make compromises with respect to those properties for efficiency reasons. Specifically, it could be that fully preserving Associativity of squash and Distributivity of Rebase will force us to introduce too much scaffolding information and thus would become too inefficent. In that case, we could either sacrifice the full associativity of the squash (e.g. only allowing changes to be squashed in a specific order to reduce the amount of scaffolding we need to keep) or introducing more specialized rebasing operations that potentially are more efficient, because they can keep the scaffolding only in memory (as discussed):

Three Way Rebase (aka sheparded rebase)
3WayRebase(A, [B0,B1, B2,…], [C0,C1, C2,…]) = A ↷ ( (B0 ○ B1 ○ B2 ○ …) ⁻¹ (C0 ○ C1 ○ C2 ○ …) ( (B0 ○ B1 ○ B2 ○ …) ↷ (C0 ○ C1 ○ C2 ○ …)) )
Sandwich Rebase
Sandwich(A, B) = A⁻¹ ○ B ○ (A ↷ B)

In our past experience, we often encountered bugs that resulted from our ChangeSet operations not adhering to these basic axioms. Sometimes this happened very late in the development process (I still discovered new ones when I integrated our code into the PropertyDDs) and those were quite hard to debug and fix. Given the fact, that we intend to introduce a lot more complexity into the ChangeSets I think that we need to analyse this very carefully. In my opinion, it would make a lot of sense, that we first try to define the ChangeSet and operations in such a way that all those axioms are fulfilled and then deviate only if we have to for performance reasons and exactly understand which properties we will be sacrificing by not fully complying with the axioms.

I am wondering, whether the best approach for this would be to actually write a formal proof for those properties in a system for machine proofing. That would help us tremendously in discovering cases, where our definition of the ChangeSets subtly violates these properties. I recently did some experiments writing such a proof for squashing of simple operations and I think this would be possible (however it would be some effort to do this).

From: @ruiterr Sent: February 22nd, 2022

In my previous mail, I glossed a bit over some details in my formalism.

There are two separate sets of entities, the ChangeSets ℂ and the States 𝕊, which in the actual system are distinct entities (in PropertyDDS the ChangeSets are JSON structures we internally keep in the DDS and the tree is the datastructure we expose to the user, using a different representation via JavaScript property classes). You can create a bijective mapping between the trees and the subset of the ChangeSet which only contain inserts (we call those normalized changesets ℕ ⊂ ℂ). This mapping is given by applying the normalized changeset to an empty Tree to crete the resulting final datastucture (we call the function doing this denormalize: ℕ → 𝕊 : x ↦ update(x, ∅), where in this case ∅ ∈ 𝕊 denotes the empty Tree; maybe we should use differente symbols than the empty set symbol to avoid confusions for the empty trees and ChangeSets).

We now have a group action (ℂ, 𝕊, update), where the ChangeSets act on the Trees via the update function (or in PropertyDDS this function is called applyChangeSet). The axioms of the group on ℂ are preserved in this group action (there are a few technicalities here, because ℕ is not closed under operations of the group, e.g. you can remove a non existing element and thus create a ChangeSet that is no longer normalized, similarly this would trigger an exception when applied to the tree).

I think for most purposes, this additional level of complexity is not necessary. Since we have a bijective mapping between Trees and normalized ChangeSets it suffices to show the axiom on the ChangeSets itself and then specify that the denormalize and update functions correctly implement the mapping.

We cannot talk about cancellation and squashing without postbase or more general forms of operation transform.

Start with string "XY".

local edit a: insert "a" at position 1

Local result is "XaY".

Remote op:

edit z: insert "Z" at position 2.

The sandwich approach relies on undoing a, doing z, and then redoing a rebased version of a, which in this case remains a. Given the sequence of operations:

a ○ a^-1 ○ z ○ a

If we want to simplify this sequence of changes using silent cancellation, then we need an operation z' that can be applied after a. This is post basing z over a. Without the ability to transform z then there is no way to simplify this collection of changes.

We should put the notion of changeset aside here. The core design question is how do we expect views consuming operations to handle remote operations applied over local operations. If we assume that it is out of the question to show the view deleting "a", inserting "z", then reinserting "a", we have two choices:

Flush the view after inserting "a" again. This is unsatisfactory in many ways:
- It is inefficient because we are doing more operations. Applying changes to the view is likely expensive.
- It introduces operations into the op stream that are really internal and orthogonal to the users intent. Reverting the state to apply z is an implementation detail.
Find a transformation for z that can be applied after a.

I think we have been assuming 2., relying on the implementation of squash to implicitly compute this transformation.

Another thing to call out is that this is not limited to local / remote operations. This also arises when a client must handle incoming remote operations, where the dependent edits for an operation are interleaved with dependent operations.

Given the op stream: [A, Z, B] where A and B are dependent and concurrent with Z. Then we have the following ops that can be applied after each other:

[A, Z', B'], where Z' = Z ↷ A, and B' = B ↷ Z'.

Given an incoming operation C, which is dependent on A and B, and concurrent with Z, then the corresponding edit is computed by:

C ↷ (B'^-1 ○ Z' ○ B')

We aim to make this efficient by squashing (B'^-1 ○ Z' ○ B') and hopefully simplifying the 'sandwich' by reducing or cancelling inverse operations. OT systems aim to achieve exactly the same result, but instead through transposition. Given:

Given [A, Z', B'], instead, transpose Z' and B' to get [A, B, Z''], and then rebase C over Z''. Z'' is the result of postbasing Z' over B'.

I think we should first get crystal clear what the system looks like for a model of atomic operations applied in a sequence. Do we allow a remote operation to be applied to the state after a local one? Right now I think changesets are trying to do two things:

Compress redundant information like two in-order updates to the same key
Somehow make a system that does not allow remote operations applied after local ones appear like it does, by cancelling sandwiches in various ways.

Rather than trying to solve 1 and 2 at the same time, I think we should solve them independently first and then try and consolidate. I believes @yann-achard-MS's proposal of postbasing + sequences of frames moves towards this.

We should put the notion of changeset aside here. The core design question is how do we expect views consuming operations to handle remote operations applied over local operations.

It is indeed a key question, but what Roland pointed out, assuming I followed correctly, is that we are confronted with the need/desire to cancel-out changes with their inverse even outside of this remove-over-local problem. If we decide to make good on this desire outside of the remove-over-local problem, then we can also leverage the solution to that for the remove-over-local problem.

Do we allow a remote operation to be applied to the state after a local one?

I'm curious where you draw the line between "yes" and "no". For me, a "no" answer would require us to explicitly roll back the local edits (then apply the remote edits, then the (rebased) local ones). This is what SharedTreeV1 does under the hood (the UI isn't aware of the roll-back), but we've established through the definition of ISharedTree that we weren't going to do that in SharedTreeV2.

So by that definition, even the explicit sandwich approach allows remote operations to be applied to the state after local ones. The difference between the explicit sandwich approach and postbase/transpose is in how we formalize (and implement it) all. I find postbase more appealing because I didn't want to put change-cancelling smarts in squash and because of performance opportunities that postbase offers, but Roland's point that we'll want those change-cancelling smarts in squash either way really makes me thing postbase is just icing on the cake (and a helpful formalization).

Rather than trying to solve 1 and 2 at the same time, I think we should solve them independently first and then try and consolidate. I believes @yann-achard-MS's proposal of postbasing + sequences of frames moves towards this.

That was indeed the intention.

That makes sense to me, but what of the need to produce the appropriate (minimal) delta for the local UI? That delta may need to undo the changes made by U and/or V. Do you not think of this as cancellation as well?

I used the term "cancelation" only with respect to the squashing of two ChangeSets. Cancelation happens if two operations in the changesets are mutually removed in the final changeset (either insert + remove of the same value/range or remove and insert, where insert inserts the same data as has been removed previously). An alternative view is, that squashing an operation and its inverse is supposed to be an empty operation (and not one doing an insert / remove).

So this does not refer to the effect in the application, where potentially some local change is removed, if it did not appear after rebasing. I agree that there is a need to compute the correct minimal changeset for UI updates, but in that specific case we wouldn't have cancelation since the inverse change is no longer present in V' and thus if you squash the two, the remove from V⁻¹ will remain in the squash ChangeSet.

My defintion of cancelation does not refer to rebases at all. The example I gave only had a few operations and their inverses which are squashed with each other. The inversion axiom expects this to give an empty ChangeSet, but depending on the order of the squashes, this cannot be achieved without additional information being put into the ChangeSet. This does relate to rebases, because if this cancelation doesn't happen, the base changeset for the rebase will be incorrect and thus the rebase result, but that is here only considered as a consequence of this more fundamental problem.

It does sound like you're describing what I've been calling "postbase" but I'm surprised you're referring to is as "rebase+apply". I think of postbase as being analogous to rebase with the key difference that it allows the application of edits to occur in the opposite order: If A and U are concurrent edits, and A is deemed to be applied first, then rebasing U over A will produce U' to be applied after A, while postbasing A over U will produce A' to be applied after U. In both cases the outcomes must be equivalent.

I think the two operations are essentially equivalent. Lets look at what you described:

A ○ U' = U ○ A'

Now U' is (U ↷ A) giving us:

A ○ (U ↷ A) = U ○ A'

If we now squash with U⁻¹ from the left on both sides, we have:

U⁻¹ ○ A ○ (U ↷ A) = U⁻¹ ○ U ○ A' = A'

So what you call postbase computes the same ChangeSet as the sandwich does (at least with respect to the state transition it creates, if the cancelation doesn't work, it might be a non-minimal changeset).

Checking my understanding: would it then be correct to say that the advantages offered by postbase (as opposed to the explicit sandwich) are purely an optimization for scenarios where the merge logic is aware of the possibility of changes cancelling out with their inverse, and that we need to support such a cancellation in the general case (i.e., when the merge logic is not aware of such things) in general?

Yes exactly that is my point. If we consider the case where a user undoes a change (computing its inverse) and sends the undo operation (just on the same branch without any conflicts), we have never done any rebasing. Still, we will have a change and its inverse being squashed with each other. If we expect that in such a case, history queries detect the cancelation and report that no change has been done, we will need to support cancelation also in the basic squash and inverse operations and not only in the rebase case. And in that case, a special postbase operation would be an optimization for this more generic logic.

So, if I'm rebasing edit U over edits [A, B, C], then U' needs to be able to say, for each of its tombstones, which of A, B, or C, deleted the content that the given tombstone represents.

That sounds like the same information I think is needed for the cancelation during squashing to work (a unique ID for removes).

In practice I think this means each removal segment in the squash of [A, B, C] would need to be tagged with an identifier that uniquely identifies the original edit (i.e., either A, B, or C) it was made in. It would also mean that contiguous deletions that originated from different edits could not be merged together (which makes squashing slightly less advantageous). Is the latter approach what you had in mind?

Yes, thats what I had in mind: unique IDs per remove segment, which we would have to preserve during squashes. So there would be a single changeset, but a remove would have to remain splitted into adjacent segments in this changeset.

This is needed because some other edit V may be rebased over U' in which case we will need to be able to compare the tombstones in V and in U'.

Could you describe in more detail where this is needed during the rebase / what the comparing of tombstones refers to?

Note that even in situations where no explicit constraint is expressed, there is still the possibility of a conflict due to implicit constraints (e.g., "insert in trait foo of node X" will conflict with concurrent edits that either directly or indirectly delete node X"). For this reason, we typically cannot squash across transaction boundaries.

I am not sure, whether this is still true, if you use the default conflict resolutions. It could be that the default conflict resolution (in this case removing the insert) would preserve the left distributivity. In that case, it would work either, if it is done after the individual conflicting change, or after applying the squashed conflict. Even more generally, an application specific conflict handler could be left distributive. This is certainly not true for every conflict handler, but if a handler does have this property, merges could be computed more efficiently.

I think that would only be possible in a system that did not offer atomicity guarantees for its transactions (at which point the term "transaction" would be misleading). Perhaps you're thinking of some specific scenarios where transactions are never made to fail.

Yes, this applies to a case where transactions do not fail. Instead a conflict handler rewrites the conflicting operation into a valid one. In such a case, it would be possible that a conflict handler does have the left distributivity property (i.e. rewrite too individual changes separately gives the same result as rewriting the combined change).

A simple example for such a change handler might be one, which just recomputes a dependent value. If this recomputation just depends on a single input value, it would give the correct result, no mater whether it is done after each individual opreation or only for a squashed sequence of operations.

The segregation and interleaving of constraint and changes frames is only needed in the latter of the two cases (changes being rebased) because that's the only time where we need to (re)evaluate constraints.

Thanks for confirming that. That's what I hoped / am striving for.

They indeed would be because peers need to be able to assess whether a constraint might have been violated, and we want to be able to do so without having to pull data from the history server and either converting the temporal information into spatial information or reifying the tree to apply the temporal changes for that purpose.

So this is more a question of the format the history is represented in and whether and to whom it is distributed than a fundamental difference? What you are describing as frames, sounds actually exactly like the representation I proposed a while back for the temporal history: an array (or tree) containing individual operations, each having operation meta data (command type, parameters), a ChangeSet describing its effect and a set of constraints.

That said, if we could find a way to convert the constraints that apply to the state after some (or all) of the transaction changes have been applied, into equivalent constraints that can be applied at the start of the transaction instead, then we'd be able to get rid of frames and we'd be able to squash all the changes within a transaction together

Yes that is an interesting idea. It might be possible in many cases to optimize the constraints in such a way.

That said, I think that frames may end up also being relevant in the other case because tombstones need to uniquely reference changes of prior edits and they might need to refer to frames to do so. This is an area of the design that is in flux and can be thought of as an implementation detail to cope with the existence of frames rather than a motivating factor.

I was hoping that this necessity could be avoided by having IDs on the individual remove segments, not on whole frames, but as you mention, we probably need to analyze this in more detail.

I would expect that a commutable insert would end up in the location where it was originally inserted. A non-commutable insert would be unaffected by the slice-move and therefore unaffected by its inverse.

So I think this would be compatible with the moves canceling out, with regard to the expected outcome, but the case E4 is a bit challenging if you consider rebasing it individually across both changes, because it refers to a non existing position within the moved slice. I assume we will need some scaffolding to preserve that lost information, but this should be similar to the cases we discussed earlier.

You may be wondering how I can get away without assigning an op ID to the delete operations. The idea here is that since I'm already unable to squash separate frames/transactions together, the revive segment can uniquely identify the delete operation that it relates to solely by referring to its Seq/Frame# and relying on positional matching. This is not something I'm particularly attached to, and I wouldn't be surprised if it stopped working or stopped being worth the effort going forward.

This would work for changes in the changeset that is rebased, but if we intend to use "frameless" changesets for the base changes we would probably need to insert the IDs there.

@jack-williams, @DLehenbauer and I had a conversation this morning. Here is my attempt at capturing the takeaways.

We tried to categorize the different solutions to the challenge of rebasing dependent edits: `eff`: `editFromFames(frames: Frame[]): Edit` returns a single edit composed of the given frames. `ffe`: `framesFromEdit(edit: Edit): Frame[]` returns all the frames of the given edit. `(eff(ffe(x))) <=> x`	#	Solution Name
0	Explicit Edit Sandwich	`[a⁻¹, z, a']`
1	Explicit Frame Sandwich	`[eff([...ffe(a⁻¹), ...ffe(z), ...ffe(a')])]`
2.a	Smart Squash	`[squash([a⁻¹, z, a'])]`
2.b	Postbase	`[z]` if `z` and `a` commute, Explicit Frame Sandwich otherwise
3	Squashed Postbase	`[z]` if `z` and `a` commute, `[squash(Explicit Frame Sandwich)]` otherwise

2.a has the advantage that it leads to a more compressed changeset. 2.b has the advantage that it is more efficient in cases where edits commute.

gets the best of both worlds.

We discussed what our roadmap should be between the above options. Our current view is as follows:

Options 0 and 1 are trivial and can be used for testing purposes.
We think tackling 2.b is the right thing to focus on right now because it a) gives us a formalism for describing how the rebasing of dependent edits work (as opposed to 2.a which leaves it as implementation detail) b) leads to a valuable performance win which we'll want long term
Once we have 2.b, we'll invest time in making squash able to further compress changes (when rebasing dependent edits but also in general, such as when a change and its corresponding undo should cancel out).

In response to my 10am post yesterday, we established to following:

Constrains being violated is the trigger point for resorting to approaches that leverage high-level edit information.
We need some data-only (i.e., spatial-format-based) answers to all scenarios, even if we'll ultimately want to nudge people toward the goodness of high-level edit in the long term.
The data-only solutions are the more pressing ones given customer demands.
The fact that we do have a long term plan of leveraging high-level edits means there is a slight incentive to delay data-only improvements that might be superseded by high-level edit solutions in the long term.

Applying (as a case study) the above points to the scenario of rebasing a branch foo onto another branch bar:

We acknowledged that being able to simply run the high-level edits of foo on the tip of bar (provided the rebase happened on a client that had the relevant application code) would yield the best merge outcomes and could prove to be more efficient than rebasing foo's edits over bar's edits in cases where bar has many (non-compressible) edits.
We nevertheless need a data-only solution in place when re-running high-level edits isn't possible/desirable.
We need the data-only solution first.

Yes exactly that is my point. If we consider the case where a user undoes a change (computing its inverse) and sends the undo operation (just on the same branch without any conflicts), we have never done any rebasing. Still, we will have a change and its inverse being squashed with each other. If we expect that in such a case, history queries detect the cancelation and report that no change has been done, we will need to support cancelation also in the basic squash and inverse operations and not only in the rebase case. And in that case, a special postbase operation would be an optimization for this more generic logic.

Just to continue on this. I agree that this something to aim for. My immediate thought is how do we actually implement this in reliable and principled way? One sketch is via transposition of operations:

Start string: "abc"

Changes:

insert 0 "01"
insert 3 "3"
insert 1 "x"
delete 4 1 (tracked as undo op#2)

Start squashing by pushing operations "back" by transposing them.

(1) Transpose last two operations:

insert 0 "01"
insert 3 "3"
delete 3 1 (tracked as undo op#2)
insert 1 "x"

(2) Cannot transpose middle two operations because of causal dependency, however adjacent do and undo can be cancelled.

insert 0 "01"
insert 1 "x"

(3) Cannot transpose resulting two operations because second is relative to first. Final changeset.

The underlying idea is that the transposition of two operations allows them to be re-ordered whilst preserving the same effect. We use transposition to permute operations and find simplifications. Transposition over a list of ops is going to give bad complexity, so instead construct a data structure over the op list that allows search and transposition in logarithmic time. For sequences we know this as a merge tree, for SharedTree this is @yann-achard-MS's changeset format.

The concept of transposition also works for the sandwich rebase. Take the example in my earlier comment:

Local edit: insert "a" at position 1. Incoming remote edit: insert "Z" at position 2.

Sandwich rebase:

insert "a" at position 1
delete "a" at position 1 (undo op#1)
insert "Z" at position 2
insert "a" at position 1 (undo op#2 aka redo op#1)

We do not cancel op#1 and op#2 because we have already applied op#1 to the local model. Instead, start by transposing the last op back.

insert "a" at position 1
delete "a" at position 1 (undo op#1)
insert "a" at position 1 (undo op#2 aka redo op#1)
insert "Z" at position 3

Adjacent undo/redo pair in middle. Cancel.

insert "a" at position 1
insert "Z" at position 3

We have already applied first op, so our changeset is the adjusted incoming op (insert "Z" at position 3). Note that the transposition we did is a postbase operation.

IMO, transposition is key to understanding and constructing squashes, and postbase is part of defining how to transpose, rather than an optimization. Changesets are a structure for efficiently transposing. This is how I see the problem, but it is only my perspective.

@jack-williams

IMO, transposition is key to understanding and constructing squashes [....]. Changesets are a structure for efficiently transposing. This is how I see the problem, but it is only my perspective.

Yes, there is a strong connection between transposition and the creation of ChangeSets / squashing. When squashing changes, we first transpose / rebase backwards the second change, so that it is relative to the base state before the first change (in contrast to being relative to the result of the first changeset). In the next step, we consolidate all operations that affect the same data (i.e. canceling insert/remove combinations and removing duplicated modifies). The result of this is a set of non-conflicting operations that all apply to the same base state. These operations are now order-independent (since all apply to the same state and no longer have any interactions among themselves). So the term ChangeSet is quite accurate here, in the sense that it is no longer an ordered array of operations, but instead an unordered set of operations (however, this might be a coincidence, I didn't initially choose the term).

So from this point of view, it is a structure for efficiently transposing, because there is no longer any order in the CS and no need to perform any transpositions for operations within a ChangeSet. Squashing two ChangeSets efficiently resolves the transposition problem, because it is no longer necesary to transpose all atomic operations within both changesets with respect to each other, but instead possible to directly merge two already order free operations sets.

and postbase is part of defining how to transpose, rather than an optimization.

I guess, it depends on what you regard as a "definition". On the one hand, you could argue that when viewed from the outside (the external contract), the definition demanding that squash produces a changeset that is equivalent to sequentially applying the two input ChangeSets defines the operation. However, if you look into the internal datastructure of the squashed ChangeSet, there are different levels of normalization that you could demand. As you pointed out, just maintaining an array of operations within the ChangeSet would suffice to fulfill the external contract, but does not really represent what I had in mind when defining the ChangeSet.

I am not sure, whether it should explictly be defined via transpositions or via some other more indirect properties. Maybe "an unordered set of atomic operations that all apply to the same base state, where no two operations modify the same data", would be a way to capture this intent?

All of this is not only about transposition, but also about filtering. By removing all dependencies between operations in the set and structuring them based on the position at which they perform changes in the data, it becomes possible to extract a subset of the changeSet, which is still a valid description of how the corresponding part of the data model is modified.

@yann-achard-MS

How would your proposed solution 2.b work, if there are multiple changes in a changeset, some of which cancel out and others don't. Would you split this changeset into multiple frames?

In our last meeting, I mentioned the importance to track the difference between insert and remove operations, versus set operations.

If you have a map and insert the key "test", and then remove it again in a separate change, the two operations cancel out:

{insert("test", first value)} ○ {remove("test")} = {}

in contrast, if you only have a set operation, which implictly creates the property (sometimes called upsert), you don't have any cancelation. Even if there is a corresponding remove operation the two don't cancel:

{upsert("test", first value)} ○ {remove("test")} = {remove("test")}

The reason for this is, that with an upsert, you don't know whether the property existed prior to the first changeset and thus you would have to keep the remove indefintely to make sure you get the correct result, if you squash with further earlier upserts. As a consequence, you will not get a proper normalized changeset when squashing and you keep unnecessary remove operations indefinetly.

So that is why I asked whether we will have explit insert/remove within the trait and not just a set. If all inserts and removes within a trait in a changeset cancel out, the whole trait would be removed from the ChangeSet?

As a related question, would applying a slice delete (from beginning to end) to a non-existing trait be an error? How is such an edit operation expressed? Is this done via the indices of the first and last element (or more specifically via a correspond skip) or by having special values for beginning and end? I think the latter case would not work, because we could never cancel such a slice delete in a squash (as we don't know, whether there are any more elements within the trait or not).

@ruiterr

I spoke with @yann-achard-MS and @DLehenbauer last week. I think we all agree that, in essence, we are all on the same page with what you outline. I think any differences in the discussion arise from talking about different levels of abstraction, as we go from the external contract of a ChangeSet and look to turn that into something concrete and implementable.

One difference that I had not considered is the base of the operations. I believe @yann-achard-MS's format assumes that within a trait (array), the operations are not from the same base, but assume the earlier op has been applied. There is some subtlety here because that format uses a counted representation rather than a positional one:

counted: [1][insert "y"][insert "x"] positional (ordered base): [insert1 "y"][insert 2 "x"] positional (same base): [insert 1 "y"][insert 1 "x"]

In the above case there is no real semantic difference between alternatives, only that some require a running adjustment to apply. Where the difference does matter, I think, is with the new operations like Revive that refer to previous operations. It seems nonsensical to have a revive operation with the same base as the operation that it is undoing. We probably hope that these operations cancel out, but we are now relying cancellation, rather than using it as optimization. I wonder if having changsets maintain causal ordering is more robust.

Anyway, just to be concrete. Here is one sketch for changeset construction.

Start with a sequence of operations in causal order, that is, each operation has the context of the prior operation. For example:

[set x "abc"][set b 42][set x "efg"][insert c.10 "x"][insert c.3 "y"]

Sort those operations by positional order, maintaining causal order. Positional order is a partial order because things like keys are not related. This gives us the tree structure allowed for filtering.

[set x "abc"][set x "efg"][set b 42][insert c.3 "y"][insert c.11 "x"] (positional + causal, by transposition) { x: [set x "abc"][set x "efg"], b: [set b 42] c: [insert 3 "y"][insert 11 "x"] }

Then we can apply normalization to each of the different sets of positionally + causal ordered op lists.

{ x: [set x "efg"], b: [set b 42] c: [insert 3 "y"][insert 11 "x"] } (remove redundant set)

Rebasing a changset recursively rebases. The case where we combine array nodes will require rebase + transposition to restore the invariant of positional + causal order.

rebase { c: [insert 5 "1"][insert 12 "2"]} after { x: [set x "efg"], b: [set b 42] c: [insert 3 "y"][insert 11 "x"] }
= { x: [set x "efg"], b: [set b 42] c: (rebase  [insert 5 "1"][insert 12 "2"] after [insert 3 "y"][insert 11 "x"]) }

where 
  rebase [insert 5 "1"][insert 12 "2"] after [insert 3 "y"][insert 11 "x"]
  = [insert 3 "y"][insert 11 "x"][insert 6 "1"][insert 14 "2"] (causal)
  = [insert 3 "y"][insert 6 "1"][insert 12 "x"][insert 14 "2"] (positional + causal, by transposition of middle two)

In this setting, ChangeSets are not unordered, but rather sets of operations that combine causal ordering and (partial) positional ordering. Even though they are ordered, this still allows for filtering. I have not considered enough to understand the trade-offs with storing at the same base, instead of the order above. One claim is that to systematically implement cancellation, we require operations being in positional form, and we cannot put operations in (true) positional form without their causal order too. (Only a claim). For the case of merging arrays within a changeset, I think we roughly fall back to the primitive of merging arrays of operations, so understanding that seems on the critical path, IMO.

[Edit: silly typo: casual -> causal]

Reacting to Roland's March 31st post

I used the term "cancelation" only with respect to the squashing of two ChangeSets.

I see, so you're using the term to refer to something that happens to operations in changesets, not to refer to the net effect on document state (which is what I was doing). I'll use your terminology for this going forward.

An alternative view is, that squashing an operation and its inverse is supposed to be an empty operation (and not one doing an insert / remove).

By "alternative view", do you mean "a different way of stating the same thing"? If not I'm confused because this sounds equivalent to the earlier description.

either insert + remove of the same value/range or remove and insert, where insert inserts the same data as has been removed previously

Note that in SharedTree "the same data" has to take into consideration node identities. Also, "remove then insert" should never cancel out because insert introduces new anchor points (and remove/delete doesn't remove them in general). Instead, "remove then revive" does cancels out (which I assume is what you had in mind).

So what you call postbase computes the same ChangeSet as the sandwich does

That much was clear to me, the part that I'm confused about is the "rebase+apply" label applied to it: to me, apply is a function that takes a document and a changeset and produces a new document that reflects the changes expressed in the changeset. None of that seems to be happening here (and rightly so). Also, why is "inverse" (and possibly "squash") not mentioned instead? It sounds like we're on the same page as to what postbase is doing, which is really what matters most, but it seems we've got a small terminology mismatch here.

Yes, that's what I had in mind: unique IDs per remove segment

I have now adopted that. In fact, all delete, insert, and move segments all now have unique IDs.

if I'm rebasing edit U over edits [A, B, C], then U' needs to be able to say, for each of its tombstones, which of A, B, or C, deleted the content that the given tombstone represents. This is needed because some other edit V may be rebased over U' in which case we will need to be able to compare the tombstones in V and in U'.

Could you describe in more detail where this is needed during the rebase / what the comparing of tombstones refers to?

Let's look at a concrete example:

Consider a sequence of nodes [A, B, C, D], where the following four edits occur:

e1/e0: delete B
e2/e0: delete C
e3/e0: insert X after B
e4/e1: insert Y after C

The ey/ex notation conveys that edit ey was issued in the context produced by the application of edit ex.

The correct outcome is [A, X, Y, D].

After e3 has been rebased over e1 and e2, it looks something like this:

[
    1, // A
    { type: "PriorDeleteSet", seq: 1, id: 0, length: 1 }, // B
    { type: "Insert", id: 0, content: [{ id: "X" }] },
    { type: "PriorDeleteSet", seq: 2, id: 0, length: 1 }, // C
],

After e4 has been rebased over e1 and e2 (but not yet e3), it looks something like this:

[
    1, // A
    { type: "PriorDeleteSet", seq: 2, id: 0, length: 1 }, // C
    { type: "Insert", id: 0, content: [{ id: "Y" }] },
],

(I omitted frame numbers in the tombstones for brevity, and also because those changes would be single-frame changes which means we'd likely not include a frame number at all)

Now e4 also needs to be rebased over e3. In order to guarantee the correct outcome, the rebase function needs to understand that the first tombstone in e3 (which represents B) is not the same as the tombstone in e4 (which represents C). To accomplish that, it relies on the fact that the tombstones are marked with the sequence number (more generally a commit identifier) of the edit that brought them about.

If instead of rebasing e4 over each edit successively we had rebased e4 over a squashed version of all the other edits then we wouldn't have needed this ability to attribute tombstones to prior edits. I expect that once the basics of rebasing are worked out we'll focus on squashing and therefore will be able to leverage this. But even once we do that, we won't escape this problem entirely because changes are not always rebased in one fell swoop.

The sequence number/commit identifier is also convenient in that it frees us from having to assign globally unique IDs to each op.

Note that even in situations where no explicit constraint is expressed, there is still the possibility of a conflict due to implicit constraints (e.g., "insert in trait foo of node X" will conflict with concurrent edits that either directly or indirectly delete node X"). For this reason, we typically cannot squash across transaction boundaries.

I am not sure, whether this is still true, if you use the default conflict resolutions.

You're right that using default conflict resolution would allow us not to drop the whole edit and would therefore free us from having to preserve the transaction boundaries.

This opens up a pretty big design question: would we consider such scenarios a conflict (in which case silently dropping the insert would entail we're no longer enforcing the atomicity of transactions/edits)? Or would we simply treat it as not conflicted (thereby forcing edit authors to add explicit constraints if they want the whole transaction to fail)?

We've always viewed the atomicity of transactions as a must-have feature, but it's fair to say that we could allow application code to opt out of them. Is that something you're advocating for?

They [frames] indeed would be [needed] because peers need to be able to assess whether a constraint might have been violated, and we want to be able to do so without having to pull data from the history server and either converting the temporal information into spatial information or reifying the tree to apply the temporal changes for that purpose.

So this is more a question of the format the history is represented in and whether and to whom it is distributed than a fundamental difference?

Do you mean that both formats are effectively a representation of the history? If so, Yes. Note that the temporal and spatial formats are still fundamentally different in that the spatial format does not include command metadata. I suppose the next logical question is whether the command metadata could be included too... Something to think about.

What you are describing as frames, sounds actually exactly like the representation I proposed a while back for the temporal history: an array (or tree) containing individual operations, each having operation meta data (command type, parameters), a ChangeSet describing its effect and a set of constraints.

You may be right that the presence of frames makes the changeset format look like what you would get if you recorded the leaf changesets in such a temporal format tree.

I think that frames may end up also being relevant in the other case because tombstones need to uniquely reference changes of prior edits and they might need to refer to frames to do so. [...]

I was hoping that this necessity could be avoided by having IDs on the individual remove segments, not on whole frames

Since I currently consider segment IDs to be unique at the scope of a frame, I need to include frame IDs in tombstones when referring to a prior segment. It may be feasible to broaden the scope of segment IDs so that they are unique at the scope of a commit instead. That's a complicating factor I do not want to deal with at this stage, but that's something we could look into.

If we did that, then we may be able to squash frames together and still be able to have transaction atomicity: we could have commits describe the segment ID for the first/last change for each frame. This requires that the segments IDs be monotonically increasing, which is currently what I have adopted.

You may be wondering how I can get away without assigning an op ID to the delete operations. [...]

This would work for changes in the changeset that is rebased, but if we intend to use "frameless" changesets for the base changes we would probably need to insert the IDs there.

As mentioned above, I have now added IDs to each segment.

Reacting to Roland's April 7th post

The term ChangeSet is quite accurate here, in the sense that it is no longer an ordered array of operations, but instead an unordered set of operations

Note that move operations complicate this picture because they force a partial ordering between the operations. Example: a changeset that inserts node foo and moves node bar under node foo.

I don't think that causes major issues, but it's good to keep in mind when we're trying to formalize the format.

and postbase is part of defining how to transpose, rather than an optimization.

I guess, it depends on what you regard as a "definition". On the one hand, you could argue that when viewed from the outside (the external contract), the definition demanding that squash produces a changeset that is equivalent to sequentially applying the two input ChangeSets defines the operation.

That makes sense. I think our attraction for transpose stems in part from the fact that it allows us to prescribe an algorithm for arriving at the correct outcome, whereas this way of looking at squash only provides us a with characterization of the properties of the squash operation.

Reacting to Roland's April 8th posts

How would your proposed solution 2.b work, if there are multiple changes in a changeset, some of which cancel out and others don't. Would you split this changeset into multiple frames?

If the changeset is composed of a single frame which includes several change segments (some of which cancel out others don't) the postbased output would include three frames as follows:

Changes that undo the changes that don't cancel out
The remote changes
Rebased changes that didn't cancel out

You can think of the case where all changes cancel out as a special case of the above where frames 1 and 3 are empty.

So that is why I asked whether we will have explicit insert/remove within the trait and not just a set.

We've always relied on separate insert and delete operations (as opposed to set/upsert). In the example you describe, the deletion would be made with a set-like range which would cancel-out with the insert in a squash.

would applying a slice delete (from beginning to end) to a non-existing trait be an error? How is such an edit operation expressed? Is this done via the indices of the first and last element (or more specifically via a correspond skip) or by having special values for beginning and end?

It would not be an error. Trait extremities are expressed naturally in the changeset format like so:

[
    {
        type: "DeleteSlice",
        startSide: Sibling.Prev, // from beginning
        endsSide: Sibling.Prev,  // to end
        length: 42, // 42 nodes were included in the slice
        id: 0,
    }
]

The fact that the slice extends from the beginning of the trait is inferred from the fact that there is no node in the trait before the segment (i.e., no offset or inserted content) and that the starting side is Sibling.Prev. It works similarly for the end except that we don't necessarily know, by looking at this changeset alone, whether there are any node in the trait after the segment. This is because trailing offsets are omitted. We don't actually really need to know. It's sufficient to know that the slice extends until the next node or the end of the trait if there are no more nodes.

Is this done via the indices of the first and last element (or more specifically via a correspond skip) or by having special values for beginning and end? I think the latter case would not work, because we could never cancel such a slice delete in a squash (as we don't know, whether there are any more elements within the trait or not).

As you can see above, I'm actually including the length of the slice in the segment, so we can (when appropriate) eliminate the possibility of earlier content being deleted by the slice.

However, as we discussed our the last meeting, there's a reason we still can't drop the delete operation: slice operations make it possible for later insertions to opt into the slice, thereby adopting its effect.

During (and following) our meeting, we identified a number of ways this could be addressed. My preference, at least currently, would be to allow slice ranges to opt-out of this ability to affect later insertions. In fact I'd go as far as saying we should make that the default, which means we'd offer the option to opt in instead.

Reacting to Jack's April 11th Post

I believe @yann-achard-MS's format assumes that within a trait (array), the operations are not from the same base, but assume the earlier op has been applied.

The way I've been thinking of it is that the format largely side-steps the question by eschewing indices in favor of offsets. (Move operations muddy those waters a bit, but only out of a performance concern. It seems like you're not really considering move operations here anyway, which I think is the right thing to do at least as a starting point.)

When pressed to pick a side, I'd actually say that the format represents the operations from the same base. What nudges me towards this stance is the fact that you could apply the changes captured by the insert and delete segments in any order you choose. Am I thinking of "same base"-ness wrong?

It seems nonsensical to have a revive operation with the same base as the operation that it is undoing. We probably hope that these operations cancel out, but we are now relying cancellation, rather than using it as optimization.

It's indeed been my intention to guarantee that these cancel out when appropriate.

I wonder if having changesets maintain causal ordering is more robust.

I want to check whether I'm understanding what "causal order" means in this context: would you say that operations are in a causal order if applying those operations in the order they appear in leads to the correct outcome? Is that the only criteria for causal order?

If so, would it be correct to say that there may be multiple causal orders given a single set of changes (such as [insert 3 "y"][insert 11 "x"] and [insert 10 "x"][insert 3 "y"] in your example)?

I have not considered enough to understand the trade-offs with storing at the same base, instead of the order above. One claim is that to systematically implement cancellation, we require operations being in positional form, and we cannot put operations in (true) positional form without their causal order too.

Are you claiming that representing operations at the same base makes cancellation impossible in some cases? Or just less efficient? I think I'm going to need an example. I wonder if @ruiterr, having more experience with both change cancellation and representing operations at the same base, can also provide some perspective here.

The way I've been thinking of it is that the format largely side-steps the question by eschewing indices in favor of offsets.

I would argue that offsets are further away from same base under the premise that you cannot pick an arbitrary point in an array of segments and get an operation out, but you need to accumulate the offsets prior to find the position, so there is some natural dependency on the earlier segments.

When pressed to pick a side, I'd actually say that the format represents the operations from the same base. What nudges me towards this stance is the fact that you could apply the changes captured by the insert and delete segments in any order you choose. Am I thinking of "same base"-ness wrong?

I have been using array representation significantly, and that may be confusing. Using the context notation we have used earlier, the way I interpret a changeset being in the same base is that the operations contained in the set all have the form: a O * where a is the base context, and * is some arbitrary output context for each op.

In contrast, in my head I was thinking that some of the changes in the set could have contexts that depend on other contexts within the set. Ideally, the "entry point" to the changeset should have the same base to allow filtering.

I want to check whether I'm understanding what "causal order" means in this context: would you say that operations are in a causal order if applying those operations in the order they appear in leads to the correct outcome? Is that the only criteria for causal order?

Yeah, I'm really referring to the 'contexts' or 'inputs' that we have used before. For example:

(causal) [a A b][b B c][c C d]
(same base) [a A a1][a B b1][a C c1]

So you are correct that there can be multiple causal orders given a set of changes, and transpose lets us find those different sets. Though if changes are not commutative / concurrent, then it may be that only one order has the correct outcome.

Are you claiming that representing operations at the same base makes cancellation impossible in some cases? Or just less efficient? I think I'm going to need an example. I wonder if @ruiterr, having more experience with both change cancellation and representing operations at the same base, can also provide some perspective here.

Take a simple case of squash(A, A^-1), where the inverse contains op id referring to A. There is no meaningful way to first put these into the same base state. We might say that there is a special case of squash that takes this and cancels them, but in general, the A and A^-1 could be complex sets of changes where the inverse is hidden away somewhere inside the structure.

If it is the role of the consolidation phase to remove these tucked away inverses, but the consolidation phase depends on all operations being at the same base, then there is a cyclic dependency. I think previously PropertyDDS did not have this issue because it represented inverses using a form that could be expressed at the same base state. I dont' think this still holds with the new inverse ops, and additionally, my hunch is that trying to push inverses to the same base state could also be a source of bugs.

This might also be another instance of us talking at different levels of abstraction, and that the 'internal contract' of ChangeSet is able to handle this. It is the external contract that produces an output at the same base / with the nice form @ruiterr describes. IMO, the internal part is non-trivial and hard to get right, and I think understanding the internal representation is important.

I would argue that offsets are further away from same base under the premise that you cannot pick an arbitrary point in an array of segments and get an operation out, but you need to accumulate the offsets prior to find the position, so there is some natural dependency on the earlier segments.

I think I'm starting to see how you're thinking of it. I agree that the accumulation of offsets (and lengths of deletion segments) is a form of dependency, but I didn't expect we'd be drawing the line there for "same base" considerations. That seems like a good topic for a live conversation if we have time. For now I'll just make the case that the dependency doesn't prevent filtering.

In contrast, in my head I was thinking that some of the changes in the set could have contexts that depend on other contexts within the set. Ideally, the "entry point" to the changeset should have the same base to allow filtering.

Would it follow from this that one could only filter changes such that:

Each change with the same base as that of the changeset can be included or excluded.
When such a change is excluded, changes that depend on it must also be excluded.

Though if changes are not commutative / concurrent, then it may be that only one order has the correct outcome.

This seems to imply we are limited in our ability to put changes in positional order (in other words: when they conflict, causal order trumps positional order), but upon closer examination, I'm not so sure that there are cases where causal and positional orders are at odds. Here are a few scenarios I considered:

Insert/move-in within insert (in a flat sequence): we want to keep track of the fact that the second insert is anchored to the content introduced by the first insert, which introduces a causal dependency. Resolution: split the first insert into two parts such that the second insert fits between them in positional order.
Insert/move-in below insert (in a tree hierarchy): the second insert can only be carried out if the first one has been effected before it. Resolution: since the second insert is represented below the first hierarchically speaking, there is no positional ordering issue.
Swap a parent-child pair: in the spatial format, we can only affect subtrees that are connected to the root, so we're forced to perform move-out changes from the bottom up (and move-ins from the top down). Resolution: same as above.
Move a slice range that covers a region of a trait where a slice delete already took place. Example: in [A, B, C, D], delete slice [B, C] then move out [A, D]. The deletion has to be performed first (we cannot move the deleted slice along and delete it at the destination). Resolution: represent the ranges hierarchically such as the range [B, C] is expressed within the range [A, D].

Was that all implicit in the notion that things can be both positionally and causally ordered?

I wonder if having changesets maintain causal ordering is more robust.

I'll try to restate your key points to check my understanding:

The new inverse segments (revive and return) have, by definition, a different base than the segments they are the inverse of.
Since segment cancellation has historically relied on representing changes such that they are positionally ordered and all at the same base, then the introduction of the new inverse segments is problematic.
Instead of representing changes such that they are positionally ordered and all at the same base, we can represent them as both positionally and causally ordered where the overall ordering is positional first then (for equivalent positions) causal.
If we do so, the work of cancelling segments is made more straightforward/less error-prone.

Is that all correct?

This leaves out the bit about positional vs. counted, which I'm not sure what to make of. Does the counted notation get in the way of the above somehow?

Time for an update.

Where We Are

We have introduced a modular architecture for encoding the merge semantics of a given field kind (see FieldKind and ModularChangeFamily). An application may use any combination of such field kinds in the schema of its document.

Within this architecture, we have added support for value fields, optional fields, and sequence fields. This support is basic in that it lacks certain key features in order to produce correct merge outcomes in all situations:

The rebase does not yet fully enforce the stored schema.
Inverse operations are not fully supported:
- We lack the ability to retain document data that is needed to apply the inverse of "destructive" operations like SetValue and Delete.
- We lack the ability to cancel-out operations and their inverses during composition in some cases.

In addition to the above limitations, we do not yet support the following:

Undo operations
Custom tie-breaking behavior for concurrent inserts that target the same gap/index.
Move operations
Slice ranges
Constraints
Drilldown

Next Steps

Our current focus is the design and implementation of a scheme that allows inverse operations to be fully supported. This works builds on our previous design work on revive/return operations, tombstones, and gap lineage tracking.

In addition to that, we are evaluating how the needs of move operations, slice ranges, and drilldown may require adjustments of our current architecture.

This PR has been automatically marked as stale because it has had no activity for 60 days. It will be closed if no further activity occurs within 8 days of this comment. Thank you for your contributions to Fluid Framework!

microsoft / FluidFramework