Tree operations - Githubissues

ckeditor / ckeditor5-design

☠ Early discussions about CKEditor 5's architecture. Closed now. Go to https://github.com/ckeditor/ckeditor5 ☠

58 stars 12 forks source link

Tree operations #71

Closed pjasiun closed 8 years ago

pjasiun commented 9 years ago

I was about to implement document.applyDelta to apply linear operations:

retain,
insert,
delete,
add attribute,
remove attribute.

I was creating tests for them to apply basic operations we will meet using these operations:

insert paragraph,
remove paragraph,
insert text,
remove text,
split node,
merge node,
wrap node,
unwrap node,
add attribute,
remove attribute,
insert block element (e.g. image) into the text element,
change node type.

Then I started to think how to handle errors during applyDelta, for example if delta contains opening tag but no closing tag or opening and closing tags cross exiting tags in wrong order? We can prepare correct deltas (set of operations), but we can not be sure that combinations of them are correct. We would need to be able to revert part of the delta which failed, so we need to record what tree modifications were done.

Then this thought appears: what if we use tree operations at the first place?

Definitely we can not have a big set of operations. When it come to combining operations we need to handle all pairs of then. And even if the result of the combination will be simple (one user split text node, another inert an image into that node, it is pretty simple to find where the image should be inserted after the splitting), considering 12 operations listed above we would need to handle 12^2 = 144 combinations (they are not necessarily commutative when it comes to collisions), it would be crazy. Fortunately we do not need so many operations, 5 would be enough:

insert child node,
remove child node,
rebase node (change a parent),
add attribute,
remove attribute.

Having tree operations, the positions of the node can be represented as a path (first child of the second child...) instead of the offset. It means there is not reason to keep the linear data structure, everything can be a tree, where characters are leaves.

Using these operations we can implement all of the higher level operations.

To compress operations we can make them working on ranges, not only singular nodes:

Insert node can take an array of nodes to be inserted one after another, with the same attributes. If these nodes are characters there could be a string instead of array.
Remove child nodes can remove multiple nodes, but only siblings on the same level of the tree, otherwise the reverse (insert) operation would not be a single operation.
Rebase nodes, similarly to remove, can rebase a bunch of siblings.
Add attribute can be applied on the range, but none of these nodes can have that attribute, otherwise reverse operation will not rollback state properly.
Remove attribute can be applied on the range, but all of these operations need to have this attribute with the given value.

What are advantages of having tree operations:

Operation is atomic, it is applied or not, no middle state which could be problematic (in linear model: what if we insert opening tag and get an error before inserting a closing tag)? It also means no wield errors during applying delta (in linear model: what if we apply different style on the opening and closing tag? or apply closing tag different then the last opened one?).
Because tree operations are atomic a single change will be a collection of operations, instead of collection of collections (deltas) of operations.
No offset recalculation. The linear data offset was problematic. On the one hand, we can not cache it - if we add anything at the begging of the document we would have to update all offsets of all nodes. On the other hand, calculating offset ever time is not the best when you think about the performance.
No magic with lengths. With the linear interface length of the node is equals number of children + 2 for opening and closing tag, except for the text node where you have no opening and closing tags and character which have a length equals 1. That is a little messy.
No linear interface and no offset means that the part of the document is a valid document, what is a great news when you start to think about nested editables.
It also means that, if we want to check how some operations changed a document we do not need to copy whole document, we can copy only the modified part. In fact we could have some proxy which will looks like the whole document but return modified parts or branches of the original document for the parts which were not touched. It may be useful for searching in the history of changes (cool feature, isn't it?).
The linear concept use numerous collections and data structures: apply transaction get a delta - collection of the linear operations, operate on the tree structure and return collection of inserted, deleted and modified nodes. If we want to be able to revert part of the broken delta we would need to save also tree operations were made. Additionally iterating over the linear structure returns linear data items what is another structure (different type for opening and closing tag, skipping opening and closing tags of text nodes etc.). So we have more-less 5 data structures. With tree operations the collection of nodes added/removed/modified by the change will be simple a normalize list of operations. There will be no linear data to iterate and not separation collection for reverting delta. So we will have 2 data structures.
Because linear diff is very slow in some cases, we were thinking about a tree diff. Tree diff produce tree operations.

And two, in my opinion most important advantages:

Tree operations would be simpler for features. It is ease to thing about features which control document when you focus on changes applied by these features. But if another feature apply some changes it may be not obvious that "Retain 30, Insert: , Retain: 15, Insert: " wrap your widget into a tag it should not be wrapped. And it is only the very simple example. Think about results of combining remote and local operations. Tree operations should be simpler to check what they influences.
Tree operations will be better for rendering. We could said that we are talking about the document model now, we do not care about dirty DOM, but at the end of the day we will need to render these changes. Having tree operations it would be simpler, and, with the rebase operation, more efficient. Note that DOM elements may have attached listeners and if we remove and add nodes every time something change, we would need to reattach these listeners every time. Or think about some widget which should not be rebuild over and over.

But not being too optimistic it is not an universal solution for all our problems:

In fact we do not get much in terms of combining operations and conflict resolving. For example it still would be unclear if the set of insert and rebases is splitting or wrapping node and how conflicts should be resolved. We could add addition annotations for better conflict resolving (for example that "this is part of the split note" or something more generic, like that this operations is connected with the previous one and if it would be rejected the previous should be reverted too).
We can not merge deltas (operations are unit, we can not merge them). Using linear operations we could merge the set of operations and then combine it with another delta. With tree operations we need to combine every pair or operations separately. This can be a performance issue. If we have two sets, where each contains 10 operation in the linear structure we would have 18 merges and one combine, in the tree model we will have 100 combine.
Tree operations are not well documented. One paper I read recently proposed them too, but it was not a very useful paper. While operations transformations are well documented, tested, proved to work, implementing different type of operations could be... lets said challenging.

In case of troubles with combining changes during the collaborative editing it should be possible to transform tree operations to the linear operations (add child node is a set of retain and insert operations) combine linear operations and transform result back to the tree form. But it may not be very efficient.

I believe using tree operations simplified our document model. We will better understand what is going on, so we will be able to better and faster solve problems we will, definitely, meet.

scofalik commented 9 years ago

That was a good read. I am all for tree operations. The basics of this model may be harded to implement but the advantages you described are very real.

fredck commented 9 years ago

rebase node (change THE parent), Hey you, developer! Let's call it "move" node.

Btw, talking about terminology, let's use the right terms and not start confusion.

We're always talking about "Operational Transformations" (not "operations") and no matter if we have linear data or a tree, we're always talking about OT. There is no default set of operations in OT. The operations themselves are application specific and in our case they serve as tree operations, that's the only difference.

Our operations would still be merged/combined, ofc. For example, two "add attribute" operations may result in a single one. Just like the insertion followed by the removal of the same node. But ofc, most of the operations will not be mergeable, which doesn't bring much limitations anyway.

As a final comment, I have the feeling that conflicts will be better handled by using tree operations, especially when dealing with node splitting and merging.

Well done @pjasiun!

Reinmar commented 9 years ago

+1. We may find some troubles, but it seems that it will be easier to understand them with tree operations than linear operations. Tree operations will be a better experience for us and for end developers and even if we won't be able to solve some edge case situations it may be a win for us if we'll be able to easier solve the most common ones.

scofalik commented 9 years ago

I found an insightful paper on OT using tree data structure: http://www.collide.info/Lehre/SeminarWS0405/DavisSunLu02.pdf. It basically confirms what we felt - that OT should be doable on a tree if we stop using retains and start using path-addresses. This might not be the best way to implement collaborative editing but at least it is something that we can stick with for now. It's documented and we can be sure that we will have a solution.

Unfortunately, we will stil be facing problems described by PJ:

It's impossible (?) to merge two following operations into one. This might (or not) create some performance problems. OTOH, I have high hopes that we will rarely have a situation where there are 10 or more operations pending on both sides.
Complex operations will be groups of basic operations and there might be problems when combining two complex operations done in the same place of the document. OTOH, we will publish this API to 3rd party developers so we will never have a situation where we cover all bases and every conflicting scenario is perfectly managed.

For a note, the complexity of combining a set of operations will be O(mn^2) where n is the size of the operations set and m is the depth of the data tree. Every combine is actually two transformation operations. As described before, for two sets, each having n operations, we will have 2n^2 transformations. The most complex function in transformation is comparing addresses of two operations. An address can be at most m long and the comparison is linear, thus we get to O(mn^2) complexity.

This may feel like a lot, but honestly, m shouldn't be that big. We rarely have a situation in the document where we have multiple block tags nested. Also remember, that inline styles won't be represented by tags but by attributes which even further simplifies tree structure. I suppose that it will be very rare that we will deal with m bigger than 4. Also, m does not scale with the size of document or number of users. Other than that, all the operations hidden behind O(mn^2) are fairly simple (few comparisons and assignments). So the only important factor is the size n.

If we really want n to be a small number we will have to implement operations on ranges. Remember, that each character is a singular node that is a leaf in the tree. The paper describes algorithms for single nodes, but generalizing them so they work for multiple nodes that are siblings should be easy. As we know, any range we have can be split into several ranges so that each of them has only nodes that are completely enclosed in that range (so all children are in that range too). At most, we will have 2 * (m-1) - 1 of those ranges, where m is the depth of the tree. It's hard to say if it is a lot. On one hand, we know that m will be somewhere between 1 to 3, OTOH we want to keep n as low as possible and while I doubt we will have more than 10 operations buffered in the set, I can easily see that many of them will be split into 2 to 4 operations.

Another issue I can see is that the delete operation defined in the paper works like DOM remove, which is detaching the node from it's parent. It makes it easier to implement move operation (especially since the paper's insert operation handle inserting nodes that already exist in the structure) but we might have to take care of detached nodes that are not then insterted anywhere -- we will need to garbage collect them in some way.

Also it is not clear to me whether move should be a basic, atomic operation or it should consist of delete and insert operations. I yet have to find a detrimental effect of spliting it into two operations. If we are to describe move as basic operation, we will need to write 6 functions for it (simple functions, but still).

Then we will have even more complex operations like split or wrap. I will have to think on how to handle them. Of course for now it may be enough to just use the basic operations. And at least we will have a brute way to handle those operations.

Last, but not least, it also not clear to me if undo will work as easy as in linear OT.

As for now, I am ready to start implementing those algorithms in JavaScript, however...

Before I have found and read the mentioned paper, I was also trying to explore a different approach that is both similar to what PJ described at the beginning of his first post (all user's operations) but is fundamentally different.

The reasoning behind it was this: combining operations that are strictly connected with how we model document structure and how DOM works is difficult. When user presses enter, we know that we have to split the node, which we do by creating a node with the same type and attributes and then we move some stuff from the first node to the created one. We can implement this using: insert, change and then one or more move operations which are in fact insert and delete operations. Hopefuly, after all of those operations are performed and combined with some other operations we will have a result that the users intented.

The idea is to focus on user's intensions rather than effects of those intentions. So, instead of resolving split operation and trying to combine this resolved effect with other operations (that, i.e. came from the server) we just mark that in the place X user wanted to split the paragraph (clicked enter key). We could even do this by introducing custom elements that are not connected with DOM but have functional meaning. So we could, i.e., insert Split node into the tree and just propagate one insert operation which is much easier to combine than various multiple operations. So combining and propagating intentions/actions would be easier and only after when we combine them, we resolve those custom-functional nodes and make changes to the DOM. It's almost like we remember where user had selection and what he clicked and then recreate all those selections and clicks on another remote collaborative editing site.

fredck commented 9 years ago

I have doubts about the use of custom elements for the user's intentions, but in general it looks like we have a good understanding about the proper way to proceed now. So let's move on.

I'm also curious about move being a delete/insert pair or not. If it is an atomic operation we don't have to face the garbage collection issue, otherwise it'll be another problem to solve. In the other hand, using delete/insert for it simplifies the amount of operations and logic we have to deal with. So maybe this approach may the first to take.

Btw, talk about potential performance issue, let's not be limited to simultaneous operations performed by different users. The most complex situation will be Undo/Redo in collaborative editing, where one should undo only their own operations.

For example, having user A and B, we may have a undo stack like this: A1, A2, B1, B2, B3, B4... B100. Then A decides to undo A2. What we expect to happen then? I expect the inverted version of A2 to be transformed by all the 100 subsequent Bn operations and then applied to the document.

scofalik commented 9 years ago

On undo: Yes, that's what would happen. I will think if this can be optimized in any way.

scofalik commented 9 years ago

On intensions: These custom elements won't be applied to DOM. That would be just a way to mark user's action in our simplified tree structure. I don't know if you understood that correctly. Those custom elements would be then resolved to "real" operations, but only after all user's actions are combined. That would be just a way for easier combines and custom nodes won't be used anywhere else. I should stick with "node" and not write "element".

For intensions-based approach I had (and still have) some hopes that this structure could be linear, or linear-like and maybe we would be able to merge two following operations.

fredck commented 9 years ago

You're proposing dirtying up the data with "temporary" markers. Maybe it is fine, as long as the data get "locked" during processing and the markers get removed once completed. Still it sounds like hacking, not the right way to do it, but it is just a feeling. Anyway, I'm not voting against nether in favor of it.

scofalik commented 9 years ago

Back to the tree-approach and undo.

There is one more problem I can see. Suppose that we delete a node from a tree and then want to undo that operation. If we removed that node from our structures completely we cannot do it. So we will have to keep everything in the structure, even if we removed it a long time ago. I don't know how problematic this can be.

As far as transforming by 100 Bx operations is concerned, maybe we will be able to somehow optimize it by providing nodes to operations instead of addresses. Or by providing addresses but saving in memory which node that was. Those IT and ET commands mostly just change the address so maybe that's the way our algorithm should evolve?

fredck commented 9 years ago

I assume that the operations stack will save the inverted operation as well... so a delete operation will save an insert operation with a copy of the node that has been deleted. So we don't save it in the data structure.

scofalik commented 9 years ago

Yes, that will happen. We will save a reference to that node with the operation. By "saved in the structure" I mean that it is just saved as a reference somewhere but, of course, it will be detached from the structure. This is dual/similar to what we often do in DOM.

The bottom line is that we will have to keep those references "forever".

scofalik commented 9 years ago

BTW. If we create inverted operation at the same time as original operation, maybe we could transform it when we get a new operation from the server? So instead of doing 100 transforms after user uses undo, we will do a bit of work when we get a new operation. This will make our script run twice as long when we get a new operation from server, but undo will be as fast as normal operation. This is a wild thought though, I didn't think it through :)

fredck commented 9 years ago

Nah... but then you'll have to this for all ops in the undo stack and for every new operation that arrive, while the other way we do it only when undo is requested (on demand).

scofalik commented 9 years ago

When we replace addresses with nodes (and offset if needed) I think we will be able to quickly make undo for insert and delete operations. The real problem starts when you want to undo attributes change.

From what I understand, in our approach, attributes are mostly connected with styling of characters (so we keep there info about font style,size,weight,family, text-decoration, etc.). In CKE4 it is impossible to attach such styles to block element (let's say ). Of course we can add/wrap text in a block element with specified class (so we can use attributes for that).

So, back to the problem, sometimes undoing attributes change doesn't really make sense anymore. Let's say you have fozbar and you apply bold to ozba. Structure does not change, only attributes. Then other editor makes his changes and you end up with: fabocdzbxyarj. I know it is complicated example with a lot of edits, but right now, what really undo should acomplish?

If we saved our original change operation as four operations, each linked to a single letter, then we theoretically can undo them (even all at once if we link all those operations together). But the effect will be funny... User clicks undo and suddenly 4 letters from random places loses bold :).

I tried to see how Google Docs handles it, so I typed some text and bolded a part of that text. Then, on another browser I moved part of the text (D&D) somewhere else. When I undone bold only not-moved part of text got undone. So I tried and moved the other part (on second browser). Then, when I tried to undone bold, nothing happend (Google had an "empty" undo step).

fredck commented 9 years ago

Well, in that case I think we're simply doing with conflicts, which will definitely happen. For that, we must define a standard resolution pattern which will define the behavior. Much probably that's what's happening in GDocs... a conflict is hit and nothing happens.

In any case, much probably we're talking about operations on ranges, which boundaries will get updated during transformation. If the final boundaries will point to an existing range, the undo operation will get applied there. Otherwise, a conflict would be identified and much probably nothing would happen.

pjasiun commented 9 years ago

That is a good new, that some papers confirmed tree-operations model.

Definitely the final document need to be clear and simple. However, I agree that at some point we will need to keep user intentions for better collision handling. For example wrap and split are in fact sets of one insert and multiple move operations. But when you think about merging them they should act differently. We can handle this problem in the several ways: we can have a meta data for operations or have special operations for split or wrap. We can add markers in the document (for the time of the algorithm) or we can keep these markers in some external data structure.

But these are implementation details. Not that important now, when we need to bring the basics, though it is good that we have some ideas future problems now. In the first implementation we need an universal solution. We need to be able to handle all types of collisions, somehow. Then we can start thinking how to handle some special cases better.

About the idea of move being insert + delete operation. The question is if operation keep the node or its copy. On the one hand if we delete an element which has a lot of children we need to keep the whole detached sub-tree in the delete operation to be able to insert it again. Making copy of it may be the waste of time and memory. On the other hand, if the insert operation contain the node which will be inserted into the document and then another operation add some attribute to this node the original operation will contain node with the attribute what may be a reason of the mess. And more changes in that node (attributes, name, children, parent) means more mess. The insert operation should keep a copy of the node to avoid such problems. If the insert keeps the copy it will not be that easy to notice that the pair of insert and delete is the move operation.

There is one more problem with keeping nodes instead of positions. If you add the bold attribute on the range 0-5 which is fozbar, operation should keep only the information that the attribute bold was added on the range 0-5, not that letters f, o, z, b, a, r get attribute. Why? Consider this example:

someone added x in the middle of our text (it will be bolded too what is expected),
then someone move letter z somewhere,
then we undo our "add attribute" operation. "add attribute" operation will be reversed, so will become "remove attribute bold from 0-5", then combined with these two operation:
- after the insert operation it will be changes to "remove bold from range 0-6", what is good, because we should remove bold also from x,
- after the move operation it will be changes to "remove bold from range 0-5", what is good, because we moved one letter from the range. Note that moved letter will be kept bold wherever it is now.

It is a better behavior then leaving x bolded and removing bold attribute from the moved letter z.

And one more topic.

That is true that if we want to undo 3 operations in the history we need to combine them with all later operations. But if we combine 3 operations with 100 it gives us 300 transformations. Note it is less then combining 20 operations with another 20. And it grows linear. However, there are some potential mechanism to optimise such collaborative undo. For example we could normalize operations to keep only minimal number of operations needed to transform document from the last our change to the next one - in fact types and sources on the remote operations are not important.

But to start thinking about such optimization we need to have a problem. I do not know if 300 operation will be a common case and if calculating it will be noticeably long.

I do not think we should over-engineer collaboration editing on this stage. The most important is that we are going in the good directions and we have some ideas for the potential problems. Now we need to code basics and start solving real issues.

scofalik commented 9 years ago

Are you sure that keeping copy of the original node with insert operation is really needed? Can you bring up a real scenario where that would mess up some things and made a result different than intented by the user?

There are two things that we do with operations: 1) Transforming. It does not matter what is the content / attributes of a node when you transform insert operation against other operations. The only thing that matters is the address where the node is inserted. 2) Undoing. In this case when we insert a node and then someone change something on it and when we undo, we will make a remove operation on that node. It is true that we will undo the changed node but when we redo, we will add the node as it was when undoing so it seems okay and this is defendable solution. The only quirk I see is that when we undo our operation (so we remove a node) then the person that changed the node will have "empty" undo step. But this is something we won't avoid anyway.

scofalik commented 9 years ago

I changed my post from "reference" to "copy" -- I make a comment to highlight this in case you are reading the post in your email box, as this changes the whole meaning of the post.

scofalik commented 9 years ago

OK, PJ explained me how it can be messy. Though we can't provide real example, it has a bad feeling, i.e. when you inserted node A and someone else insterted node B into it, when we undo insert A, we will get remove A and transform insert B against it. It is implementation depended if that would insert node B again but it feels messy.

scofalik commented 9 years ago

I've implemented algorithms from the paper to see how fast they work and if implementation is difficult. Here you can play with the implementation in testing environment: https://github.com/ckeditor/ckeditor5-design/tree/op-trans

Note, that not everything may work correctly, but from what I've tested it is pretty good. And what interests us most is speed and AFAICS it is great.

marcelklehr commented 9 years ago

For starters: tl;dr so, bear with me :) I, too, see the value in (dom) tree ot and have started implemented it in http://github.com/marcelklehr/dom-ot I didn't follow any paper, though, so it might be buggy. I still would very much like to join forces, though.

dmitryuv commented 9 years ago

@pjasiun Hi,

Before moving with Tree structure i suggest trying to solve "classic table OT problem", if you will. The problem is:

user A inserts a Column
user B concurrently inserts a Row

The result should be a consistent table with +1 row and +1 column.

Also in case of DOM OT editor must guarantee (i guess CKEditor does) that DOM representation is independent and consistent across different browsers.

Another option I was researching recently is to have JSON (you can peek at https://github.com/ottypes/json0) for top-level structure (define paragraphs and other important blocks) and use rich-text based OT (like https://github.com/dmitryuv/otdartlib/) for working with formatted text inside these blocks. Additionally, tables will require special 2-dimensional operations to handle & resolve concurrency.

pjasiun commented 9 years ago

Also in case of DOM OT editor must guarantee (i guess CKEditor does) that DOM representation is independent and consistent across different browsers.

This is why I do not talk about DOM OT, but about Tree OT which will transform the Document Model. This Model will be then transformed into the DOM, but Document Model is abstract and browser independent.

Before moving with Tree structure i suggest trying to solve "classic table OT problem", if you will.

In fact there are more collisions which are hard to solve without the context, without knowing the user's intention. One of the solutions may be adding some meta information about the change (change is a set of operations). So the change will be a set of atomic "insert element (cell)" operations, but on the other hand it will have a meta information that it is "insert row". Thanks to this "labeling" a conflict with the "insert column"-change will be handled in the special way and other while other operations will be transformed in the standard way. Thanks to this, we won't need to resolve every conflict separately. (We won't have to write a separate logic for transforming insert-row against change operation or remove operation).

This is just one of ideas. Another is to represent table in Tree Data Model in a different way than it is in DOM. So the table would be a node that has attributes rows and cols. Then each cell that has some content is represented as a child. Empty cells won't be nodes in Tree Data Model. This is convenient on Tree Data Model side but may be difficult to handle in real dirty-DOM operations.

Another option I was researching recently is to have JSON (you can peek at https://github.com/ottypes/json0) for top-level structure (define paragraphs and other important blocks) and use rich-text based OT (like https://github.com/dmitryuv/otdartlib/) for working with formatted text inside these blocks. Additionally, tables will require special 2-dimensional operations to handle & resolve concurrency.

I talked with @fredck last month about such solution and I do not like it because tree (structure) operations and text operations are not totally separated. You need to solve cases like split node, moving part of the text or inserting tree structure into the text (inline widgets). I am afraid that the final solution would have too many pairs of operations to handle. But I agree that there are advantages of such structure too.

pjasiun commented 9 years ago

For starters: tl;dr so, bear with me :) I, too, see the value in (dom) tree ot and have started implemented it in http://github.com/marcelklehr/dom-ot I didn't follow any paper, though, so it might be buggy. I still would very much like to join forces, though.

We already checked your repository last week. It is very interesting but, to be honest, it is hard to said how it will fit CKEditor 5 needs at this stage.

dmitryuv commented 9 years ago

I talked with @fredck last month about such solution and I do not like it because tree (structure) operations and text operations are not totally separated. You need to solve cases like split node, moving part of the text or inserting tree structure into the text (inline widgets). I am afraid that the final solution would have too many pairs of operations to handle. But I agree that there are advantages of such structure too.

Agree, because of this we actually use linear rich-text OT right now with placeholders that can hold widgets (a subdocuments that can use different OT type, like JSON). This creates a set of limitations for the possible document complexity but easier (and requires less resources) to handle. For example, there is no split into nodes operation, or operation to move text between nodes. If i want to work with tables, it's a subdocument with own structure. Image is a placeholder with image URL as an attribute. And so on. Where it may get hairy is complex nested rich documents with tables in tables and like that, but I don't target to support it right now, pareto principle :)

I have working prototype for rich-text editor based on CodeMirror because it have most powerful API to watch for changes and manipulate content. It supports basic formatting, lists, images & widgets. But lacking of spell checker is a deal breaker for going to production, so I was very encouraged seeing such discussion around CKEditor.

Also I know guys who work on office suite and collaboration based on OT (they also mentioned they use tree-based structure). I'm going to reach them to find out if they're willing to share more details.

scofalik commented 9 years ago

I updated https://github.com/ckeditor/ckeditor5-design/tree/op-trans with complementary set of algorithms and tests that uses node+offset as an address.

Compared to root+path addressing the new operations are much simpler when it comes to transforming them, there are also less edge-cases. They will probably also be easier to handle when creating features/plugins.

On the other hand we will still need some arbitrary addressing, like root+path, if we want to pass operations between client and server, because we have to pass strings. So the node+offset addressing would be for internal use. I have to think if conversion from root+path to node+offset will make problems when passing operations between client and server.

Another option is keeping some kind of nodes' IDs but it seems bad...

scofalik commented 9 years ago

As far as I understand, we will have have a trouble with converting those two address notations.

Say we have a document in state A and we create and apply operation O that changes document structure to A'. Then, if we get a remote operation with root+path address that applies to document in state A we will get wrong node if we convert root+path to node+offset using structure A'. So we would have to remember structure A. It won't be that difficult because, AFAICS we would need just one structure remembered - the one that is last known structure of the document on the server. But it is some complexity gain.

As I proposed earlier, we could create a unique ids for nodes during their creation. Then, if we send operation to the server it would be something like remove 58394 2 where 58394 is a unique id. Whenever new node is inserted into structure, we create id for it and also pass it to the server: insert 222 0 text a 342 where 222 is the node we insert into, 0 is an offset, text a represent the type of node and the content and 342 is the ID of the new node.

Node+offset notation has its merits (mostly simplicity when it comes to defining OT algorithms for typical operations like insert remove or change but also for the new ones we are about to implement, like move). However, I think that if we lose some complexity in one place (OT algorithms) we gain it somewhere else (client<->server communication).

scofalik commented 9 years ago

After a discussion with @fredck and @pjasiun we decided to go with path's because of difficulties we would have to face with other approaches.

scofalik commented 9 years ago

While designing and working on tree operations we came up with multiple issues and scenarios beacuse of which we had to made some decisions or changes to algorithms. Because the number of such cases is growing it is a good time to document some of them.

1) Undo should transform against transformed incoming operations. Situation sketch: START: zz. USER1: insert 'a' at 1. USER2: insert 'b' at 1 -> insert 'b' at 2. STATE: zabz. UNDO1: remove from 1. If UNDO1 is transformed by original USER2 it will become remove from 2 which will remove b not a.

2) Site operations should keep they most recent transformed version in case of getting operation with old document version (also see 5). Situation sketch:

3) Move out from removed node. Situation sketch: Problem: The transformed move operation can't get a proper address pointing to the moved/inserted node because it is no longer in main tree so there is no valid path leading to it. Solution: In operation, we keep both path and root of the detached tree.

4) Move into removed node. Situation sketch: Problem: When we undo removing (or a feature will add removed node again to the tree) we will lose information about incoming operation. Problem: The transformed operation can't get a proper address Solution: Make operations on detached nodes even though they are not visible.

5) Inserting into removed range. Situation sketch: Problem 1: Cannot create transformed remove operation that will move document into desired state. Problem 1: Undo on left side moves document into not desired state (zxyabz instead of zxabyz). Problem 2: Undo on right side does nothing. Problem 2: Unexpected and permanent content loss (on neither side undo re-enters 'ab'). Solution: Split incoming operation into two operations if needed. Problem 3: Still incorrect undo. Solution: Instead of reversing original site operation and transforming against incoming operations, reverse most recent transformed version of undone operation. So in this case, we will have undo = ins 1 [x], ins 4 [y] instead of ins 1 [xy].

scofalik commented 9 years ago

Due to various issues connected with tree-operations on ranges we will introduce a different approach to removing (ranges of) nodes from data model.

tl;dr: remove now moves node to the special tree. See image below.

Problem: In current approach when there is text abcd and User A removes abc while User B removes bcd we are unable to retrive bc on undo. Even though both users have the same state of document and can continue their editing. Unfortunately, undo is fundamental feature that will be a base for many other features, so we really have to put our focus on it.

Similiar problem occurs when User A sets bc to bold and User B removes cd. Then, users don't get to the original state of document when both of them undo (AFAIR there was even a problem with getting to the same state of document).

Solution: Earlier, we discussed that operations on ranges act a bit like there was an artificial, "invisible" parent for all nodes in the range. For example, removing n nodes that are siblings is like creating an artifical parent for them, appending them to the parent and removing the parent. The idea seemed to solve some problems but after all we decided to implement other ideas that were simpler than creating and maintaining special "artificial" parents.

Since those simpler ideas aren't perfect I revisited the "artifical" nodes idea but went a step further. Most issues occur when User A does something and User B removes a range of nodes. Instead of introducing "artificial" parents for removed ranges I propose creating a special root (let's call it graveyard root). This root will be one the same "level" as main document root but won't be rendered. In fact, it will serve as a store for removed nodes.

So, the core change introduced by this idea is that remove operation no longer just detach a node from it's parent, creating a detached tree in tree model. Now, remove operation will create an "artificial" parent in graveyard root and move those nodes to that artifical parent. So, in fact, we won't event have remove operation on the OT level - it will be substituted by insert + move. (BTW. we still haven't decided if we really need the artificial node, we might just move the nodes to graveyard tree, but right now artificial nodes seems safer).

Thanks to this idea we achieve a few things:

Operating on removed nodes is easier and more powerful. Instead of operating on multiple detached trees we operate on a range in a different tree. It is easier to have one solution for removed and not-removed nodes. It also makes undo manager more powerful. Previously, we were able to correctly undo any set of move operations done by other users. We keep the same solutions and algorithms for move operations but now they also "handle" remove operation. This solves all issues connected with remove operation. Notice, that previously when you removed a range of n nodes, it made n detached trees. If another user made a change operation on the same range, we would have to create n change operations, each changing one of the n detached nodes. This was very cumbersome. Now, we simply apply the same change operation but on moved set of nodes.
We don't have to synchronize detached nodes between users' clients. Previously, we would have to create, keep and send unique ids that would identify detached nodes. Any operation send to another client, that would address a node in detached tree would have to include that id. Now we will have just two detached nodes: document root and graveyard root. Since Operational Transformations guarantees us that all trees in all clients are in the same state, our graveyard tree is "automatically" synchronized across all clients. Nodes that are removed have their own, unique addresses in graveyard tree, so we just have to send the information whether the operation's address is in document tree or graveyard tree.
Managing size of undo history is easier. We can easily compute the size of the graveyard tree (i.e. by counting child nodes). If it is too big we can drop some parts of the tree and undo steps that are too far in the past won't do anything (credits to PJ for this idea.).

Since the new idea is pretty ground-breaking a lot of code has to be changed. But this new soultion to removing nodes is bringing many advantages and it is profitable to introduce it. Right now I estimate that there is around 50% of work done in this matter.

We are also keeping the solution that one operation when transformed against another operation may, in some special cases, produce two operations. For example, when we have a text abcdef and we inserted xyz between c and d we will have text abcxyzdef. But then we get a change operation on bcde that are now spread apart. While transforming incoming change operation against insert operation we split it into two change operation, one concerning bc range and the second one concerning de range.

scofalik commented 9 years ago

I pushed branch op-trans with expanded set of algorithms that implements:

processing ranges
the new approach to removing nodes (moving to graveyard tree)

While writing tests I documented all the cases from tests to have a reference for myself, to see if I haven't omitted anything and if cases from tests are described and coded correctly. It does not mean that everything is 100% correct for sure, but those scenarios should work fine.

At the same time, it's a documentation of how editor contents will be transformed when different scenarios happen so it is a description of collaborative editing UX. It's also a short summary of our OT research. I'll just post that stuff here. Scenarios are mostly in the order from the code (except of change x change transformations). Of course I cannot guarantee that they will stay this way.