ottypes / json0

Version 0 of the JSON OT type
447 stars 64 forks source link

Undo/redo applied multiple times #21

Closed stas-sl closed 7 years ago

stas-sl commented 7 years ago

Hi!

Based on discussion I understood how to implement undo, but I faced a problem when trying to implement undo/redo functionality.

Let's say we have operations: A B C

Then user undoes C and B: A B C C' B' where C'=inverse(C), B'=transform(inverse(B), [C, C'])

Then he redoes B: A B C C' B' B'' where B''=transform(B, [C, C', B'])

Then he tries to undo B once again: A B C C' B' B'' B''' where B'''=transform(inverse(B), [C, C', B', B''])

but B''' becomes [] (noop), but it should be B''' = B'

Am I doing smth wrong?

const json = require('./lib/json0')

const transform = (x, ys) => {
  for (let y of ys)
    x = json.transform(x, y, 'left')
  return x
}

let a = [{p: [0], li: 'a'}]
let b = [{p: [1], li: 'b'}]
let c = [{p: [2], li: 'c'}]
let c_ = json.invert(c)
let b_ = transform(json.invert(b), [c, c_])
let b__ = transform(b, [c, c_, b_])
let b___ = transform(json.invert(b), [c, c_, b_, b__])

let o = []

console.log(o)                         // [] 
json.apply(o, a); console.log(o)       // [ 'a' ]
json.apply(o, c); console.log(o)       // [ 'a', 'b' ]
json.apply(o, b); console.log(o)       // [ 'a', 'b', 'c' ]
json.apply(o, c_); console.log(o)      // [ 'a', 'b' ]
json.apply(o, b_); console.log(o)      // [ 'a' ]
json.apply(o, b__); console.log(o)     // [ 'a', 'b' ]
json.apply(o, b___); console.log(o)    // [ 'a', 'b' ]

console.log('b_ =', b_)                // [ { p: [ 1 ], ld: 'b' } ]
console.log('b___ =', b___)            // []
josephg commented 7 years ago

Hm, this is a great puzzle. I stared at it for a good while and needed some more console.log's to figure out whats going on.

The object level problem is that when you transform json.invert(b) by b_ it turns into a no-op. That happens because json.invert(b) is only a valid operation on a document containing b, but once b_ has been applied the document no longer contains b. In that moment invert(b) is meaningless.

The real solution is that the second time you try to undo b you actually want to undo the most recently applied version of it - which is b__. In this case you want b___ = json.invert(b__), although in the general case it'll be b___ = transform(invert(b__), [intervening ops]). Or put another way, if you undo, then redo, the thing to add back to your undo stack isn't the original operation b. You want to add the operation that was applied by the redo - aka b__.

I hope that makes sense.

stas-sl commented 7 years ago

@josephg, thanks for your response, I really appreciate it.

I'm still trying to figure out how to better implement your solution: how to store undo/redo actions and when to transform them by remote ops - when they come, or only when performing undo/redo. Hope I will figure out it myself, though things look a bit entangled.

If I understand you correctly, then I think that b__ should be also calculated based on b_: b__ = invert(b_, [intervening ops]) and not like in my example b__ = transform(b, [c, c_, b_])

Though, I have one more confusing thing. In my example, if user first performs 3 ops: A B C

then undoes C and B: A B C C' B'

and then redoes B and C: A B C C' B' B'' C''

i expect to get the same array ['a', 'b', 'c'], but what I get - see below

const json = require('./lib/json0')

const transform = (x, ys) => {
  for(let y of ys)
    x = json.transform(x, y, 'left')
  return x
}

let a = [{p: [0], li: 'a'}]
let b = [{p: [1], li: 'b'}]
let c = [{p: [2], li: 'c'}]
let c_ = json.invert(c)
let b_ = transform(json.invert(b), [c, c_])
let b__ = json.invert(b_)
let c__ = transform(json.invert(c_), [b_, b__])

let o = []

console.log(o)                           // []
json.apply(o, a); console.log(o)         // [ 'a' ]
json.apply(o, b); console.log(o)         // [ 'a', 'b' ]
json.apply(o, c); console.log(o)         // [ 'a', 'b', 'c' ]
json.apply(o, c_); console.log(o)        // [ 'a', 'b' ]
json.apply(o, b_); console.log(o)        // [ 'a' ]
json.apply(o, b__); console.log(o)       // [ 'a', 'b' ]
json.apply(o, c__); console.log(o)       // [ 'a', 'c', 'b' ] --- expected [ 'a', 'b', 'c']

console.log()
console.log(json.invert(c_))             // [ { p: [ 2 ], li: 'c' } ]
console.dir([b_, b__], {depth: 3})       // [ [ { p: [ 1 ], ld: 'b' } ], [ { p: [ 1 ], li: 'b' } ] ]
console.log(c__)                         // [ { p: [ 1 ], li: 'c' } ] ---  expected [ { p: [ 2 ], li: 'c' } ]
josephg commented 7 years ago

If I understand you correctly, then I think that b should be also calculated based on b_: b_ = invert(b, [intervening ops]) and not like in my example b = transform(b, [c, c, b])

Yes that sounds right.

The problem you're running into here is a limitation of this type of OT algorithm. Essentially, when invert(c) is transformed by b_, the relative ordering between the two operations is lost. c__ is, in a sense, trying to insert a new character. It doesn't know whether the new character should go before the 'b' or after it. As far as its concerned, b is a new, concurrent operation too and the relative order of the inserts is arbitrary. This is only a problem when the inserts are at the same location.

If you change x = json.transform(x, y, 'left') to x = json.transform(x, y, 'right') you'll see the opposite behaviour - this example will be correct but you can construct another example where it'll be wrong. (Instead of resolving the ambiguity by inserting on the left, it'll resolve by inserting on the right).

stas-sl commented 7 years ago

Thanks for the explanation, @josephg. I think it makes sense what you are saying. This can be quite OK behaviour when 2 users make simultaneous inserts at the same position - then it doesn't matter whose char will be inserted first and whose second. But when one user makes undo and then redo - it could be quite confusing for users. Do you think it could be handled somehow by more sophisticated undo/redo algorithm? Maybe we should decide which side (left or right) pass to transform more wisely?

I was wondering how other tools, which I believe also use OT, handle such cases. As you can see in the GIFs below they do it as I would expect from user's point of view. Even in more complicated scenario, when second user performs intervening ops during edits and before redo.

In the example user 1 types 'a', 'b', 'c', but after each of chars, user 2 inserts '-', so resulting string is 'a-b-c'.

Then user 1 undoes his 2 last chars 'c' and 'b'.

Then user 2 inserts two additional '--' in the beginning.

Then user 1 redoes his actions and we get string that combines all operations of both users: '--a-b-c' in expected order.

Google docs undo/redo google docs undo/redo

Etherpad undo/redo etherpad undo/redo

Maybe you have ideas how this could be achieved? Do they handle undo/redo somehow specially? Or they track additional info? Or..

josephg commented 7 years ago

The buggy behaviour we're talking about through json-ot wouldn't happen from that sequence of actions. It should handle that case just fine.

The buggy behaviour would require:

Text should say 'ab', not 'ba'.

If that works correctly, try swapping the first 2 actions (so user 2 types 'b' first, then user 1 types 'a', then the rest of the actions.)

stas-sl commented 7 years ago

I tried your scenario and, yes, it fails to restore the original sequence in one of two opposite cases (but not the same across all tools). I also tried it in Prosemirror which uses not exactly OT, but something similar.

When I showed the scenario in GIFs - I just thought of it as a more complicated version of the single user case that I tried before and that failed with ot-json: user 1 makes 2 undos and then 2 redos. In modified(complicated) version user 1 makes the same operations, the only difference is that these operations are interleaved with user 2 actions.

I also checked simple version, exactly one that fails with ot-json, and it works in all tools. From those observations probably it can be concluded that though they use OT, which in corner as you provided, can work ambiguously, but probably they handle undo/redo actions differently or use some tricks?

The buggy behaviour we're talking about through json-ot wouldn't happen from that sequence of actions. It should handle that case just fine.

I also checked more complicated version with user 2 using ot-json, and as you say, it actually works. Which I don't fully understand - why simple version fails, but more complicated works, though user 1 makes the same actions.

const type = require('./lib/text0')

const transform = (x, ys, side = 'left') => {
  for(let y of ys)
    x = type.transform(x, y, side)
  return x
}

let u1_a = [{p: 0, i: 'a'}]
let u2_1 = [{p: 1, i: '-'}]
let u1_b = [{p: 2, i: 'b'}]
let u2_2 = [{p: 3, i: '-'}]
let u1_c = [{p: 4, i: 'c'}]
let u1_c_ = type.invert(u1_c)
let u1_b_ = transform(type.invert(u1_b), [u2_2, u1_c, u1_c_])
let u2_3 = [{p: 0, i: '-'}]
let u2_4 = [{p: 1, i: '-'}]
let u1_b__ = transform(type.invert(u1_b_), [u2_3, u2_4])
let u1_c__ = transform(type.invert(u1_c_), [u1_b_, u2_3, u2_4, u1_b__])

let o = ''

console.log(o)
o = type.apply(o, u1_a); console.log(o)    //
o = type.apply(o, u2_1); console.log(o)    // a
o = type.apply(o, u1_b); console.log(o)    // a-
o = type.apply(o, u2_2); console.log(o)    // a-b
o = type.apply(o, u1_c); console.log(o)    // a-b-
o = type.apply(o, u1_c_); console.log(o)   // a-b-c
o = type.apply(o, u1_b_); console.log(o)   // a-b-
o = type.apply(o, u2_3); console.log(o)    // a--
o = type.apply(o, u2_4); console.log(o)    // -a--
o = type.apply(o, u1_b__); console.log(o)  // --a--
o = type.apply(o, u1_c__); console.log(o)  // --a-b-c
josephg commented 7 years ago

The important part isn't complexity. The bug happens when there's a point in time where:

When that happens its impossible for the algorithm to determine the intended order of the inserts. As a result the algorithm picks using the left/right parameter instead, and you get an arbitrary (but consistent) order.

In the test you did above (the one you recorded with GIFs) the edits are always in different parts of the document. Thats why the bug does not show up.

josephg commented 7 years ago

A system built on top of OT-with-tombstones or CRDTs won't have this behaviour though, because its impossible to truly delete characters - so the old insert position is always recoverable.

stas-sl commented 7 years ago

Thanks for explanation and suggestion, @josephg.

I will definitely look at CRDTs. Though I'm thinking maybe I want to much: support arbitrary json structure, invertabiliy (incl. undo/redo case) and so that it was complete solution from DB to client in browser and reliable enough to use in production. So that I can just work on app logic. Actually ShareDB + ot-json almost satisfied all these requirenents and I was able to quickly implement some app prototype which collaboration capabilities look quite impressive considering how little time it took.

josephg commented 7 years ago

Yes. Other problems: No transaction support, poor support for different databases, and the json0 type doesn't support arbitrary reparenting of items.

You don't want too much - but currently no system will do it all. I'm working on all all these issues with statecraft - which I'm hoping to be a spiritual successor. But its far from ready, and will have its own set of issues. Good luck finding a good tradeoff in all this!

stas-sl commented 7 years ago

Thanks! Superposition of existing DBs is really interesting and ambitious idea!