share / sharedb

Realtime database backend based on Operational Transformation (OT)

Duplicate ops sent over the wire (Was: What does side mean ? transform(op1, op2, side)) #354

Open undeadfrost opened 4 years ago

undeadfrost commented 4 years ago

In transform(op1, op2, side), side is either 'left' or 'right'. What does that mean? When is it 'left', and when is it 'right'?

josephg commented 4 years ago

It's an argument for breaking ties. If two users insert text in the same location at the same time, your app needs to decide which user's text goes first. And it has to do that consistently. The standard / easy way to do that in a serialized server/client system like sharedb is to make whichever operation hit the server first "left" and the operation that hit the server second "right".

If you like, imagine two concurrent operations on an empty document: op1 inserts 'a' at position 0, and op2 inserts 'b' at position 0.

Operation 1 hits the server first. Then operation 2 hits the server. The server transforms op2' = transform(op2, op1, 'right') to produce op2' = insert 'b' at position 1. This is applied locally, producing the document ab. The server then stores / commits op2' to the database.

The client who sent op2 receives op1, and knows it was concurrent with their own operation. They transform op1' = transform(op1, op2, 'left') to produce op1' = insert 'a' at position 0 (and skip 1 character afterwards). They apply that locally and get a new document ab, just like on the server.

It doesn't matter which way around the left/right arguments go, so long as whenever you have the diamond property in transform, one arm is 'left' and the other arm is 'right'. The easy way to do that is to make the server always use one and the client use the other. (Or the operation with the lower guid is 'left' and the op with the higher guid is 'right' or ... compare clientids or ... whatever). If left/right were the other way around, you'd get ba instead but it would still converge just fine.
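The tie-break can be checked with a minimal sketch. This is not ShareDB's real op format or any OT type's actual transform: it assumes a toy insert-only op of the shape `{ pos, text }`, and it assumes op1 and op2 both insert at position 0 of an empty document, which is consistent with the positions quoted above.

```javascript
// Toy insert-only operation: { pos, text }. Illustrative only.
function transform(op, other, side) {
  // Shift `op` right if `other` inserted strictly before it, or at the
  // same position when `op` is on the 'right' side of the tie.
  if (other.pos < op.pos || (other.pos === op.pos && side === 'right')) {
    return { pos: op.pos + other.text.length, text: op.text };
  }
  return { pos: op.pos, text: op.text };
}

function apply(doc, op) {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

const op1 = { pos: 0, text: 'a' }; // reached the server first
const op2 = { pos: 0, text: 'b' }; // reached the server second

// Server: has already applied op1, so it transforms op2 as 'right'.
const op2Prime = transform(op2, op1, 'right');
const serverDoc = apply(apply('', op1), op2Prime);

// Client 2: has already applied op2, so it transforms op1 as 'left'.
const op1Prime = transform(op1, op2, 'left');
const client2Doc = apply(apply('', op2), op1Prime);

// Swapping both sides still converges, just in the other order.
const swappedServerDoc = apply(apply('', op1), transform(op2, op1, 'left'));
const swappedClientDoc = apply(apply('', op2), transform(op1, op2, 'right'));

// Using the same side on both arms of the diamond diverges.
const brokenServerDoc = apply(apply('', op1), transform(op2, op1, 'left'));
const brokenClientDoc = apply(apply('', op2), transform(op1, op2, 'left'));

console.log(serverDoc, client2Doc);             // ab ab
console.log(swappedServerDoc, swappedClientDoc); // ba ba
console.log(brokenServerDoc, brokenClientDoc);   // ba ab
```

The last pair shows why one arm must be 'left' and the other 'right': with the same side on both arms, the two replicas end up with different documents.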

josephg commented 4 years ago

We should make a Q&A wiki page for things like this.

undeadfrost commented 4 years ago

In the above example, the server will send op2 to client 1. So does client 1 compute op2' = transform(op2, op1, 'right')?

josephg commented 4 years ago

Great question, and surprisingly, no! From client 1's perspective, the operations weren't concurrent! As far as client 1 is concerned, it submitted op1, then the server acknowledged op1, then the server sent op2' and that just gets applied locally. Client 1 has no idea that client 2's operation was actually concurrent because it's the server that does the work of transforming op2 into op2' before forwarding the operation.

In a p2p setting, or using CRDTs, the server doesn't transform. And in those cases, yes: the client would need to generate op2' = transform(op2, op1, 'right'). To figure out which operation is 'left' and which is 'right' it'd need to look at who had the lower client ID (or which op has a lower guid or something) to give the two operations a defined order. But in the server-client model it's enough that the server always uses 'right' and the client always uses 'left' (or the other way around) for everything to work.
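For the p2p case, a rough sketch of picking the side from client IDs might look like the following. The `sideFor` helper, the peer IDs `'alice'` and `'bob'`, and the toy `{ pos, text }` op format are all hypothetical and not part of ShareDB; the only idea taken from the comment above is that the op from the lower client ID is treated as 'left'.

```javascript
// Toy insert-only transform, illustrative only (not a real OT type).
function transform(op, other, side) {
  if (other.pos < op.pos || (other.pos === op.pos && side === 'right')) {
    return { pos: op.pos + other.text.length, text: op.text };
  }
  return { pos: op.pos, text: op.text };
}

function apply(doc, op) {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Hypothetical helper: the op whose author has the lower client ID is 'left'.
function sideFor(opClientId, otherClientId) {
  if (opClientId === otherClientId) {
    throw new Error('ops from the same client are not concurrent');
  }
  return opClientId < otherClientId ? 'left' : 'right';
}

const opA = { pos: 0, text: 'a' }; // written by peer 'alice'
const opB = { pos: 0, text: 'b' }; // written by peer 'bob'

// Peer alice applied opA locally, then receives bob's opB.
const opBPrime = transform(opB, opA, sideFor('bob', 'alice'));
const aliceDoc = apply(apply('', opA), opBPrime);

// Peer bob applied opB locally, then receives alice's opA.
const opAPrime = transform(opA, opB, sideFor('alice', 'bob'));
const bobDoc = apply(apply('', opB), opAPrime);

console.log(aliceDoc, bobDoc); // ab ab
```

Because both peers derive the side from the same pair of IDs, each op gets the same side no matter which peer performs the transform, so the two arms of the diamond never receive the same side.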

undeadfrost commented 4 years ago

When concurrent operations occur using sharedb, client 2 always receives the same op twice. Why?

josephg commented 4 years ago

Are you talking about the wire protocol? Can you show an example? That might be a bug.

undeadfrost commented 4 years ago

The rich text editor I used was SlateJs. I was inserting images on two computers at the same time, and the op was received twice.

Client 2: the same op arrives twice:

[Screenshots: socket, socket2]

The value of side is different:

[Screenshot: console]

On the server, the op only appears once:

[Screenshot: server]

josephg commented 4 years ago

I'm not entirely sure. There are a few messages that can cause that sort of thing. When you send an operation and the operation is in conflict / old, it's the client's responsibility to transform and re-submit. There are also separate server-to-client messages for acknowledging the client's message and for telling the client that its message has appeared in the operation stream.

But yeah, might be a bug. I'll leave people who've been interacting with the codebase more to handle that one.

undeadfrost commented 4 years ago

Although I currently receive the same op twice, my document is correct. The duplicate appears to be sent directly from the server.

curran commented 4 years ago

@undeadfrost Re: duplicated ops, can you please share a sample project that ShareDB developers could use to reproduce the bug? It does look like a bug, and the first step towards fixing it is the ability to reproduce it. Thank you.

alecgibson commented 4 years ago

Do the duplicate ops only happen when you have two clients open? Similarly, if you open a third client, do you get triplicate ops? If so, then you're probably re-submitting remote ops as if they're local ops.
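One way the re-submission described above can be avoided is sketched below. `FakeDoc` and `FakeEditor` are stand-ins written for this example, not ShareDB or Slate classes; the sketch only assumes that, as in ShareDB, the doc's `op` event passes a `source` argument that is truthy for locally submitted ops and false for ops received from the server. The `remote` flag on the editor is a hypothetical mechanism for telling the two kinds of change apart.

```javascript
// Stand-in for a ShareDB Doc, with just enough behaviour for the demo.
class FakeDoc {
  constructor() { this.listeners = []; this.submitted = []; }
  on(event, fn) { if (event === 'op') this.listeners.push(fn); }
  // Local edit: record it as sent to the server, then emit with source = true.
  submitOp(op) { this.submitted.push(op); this.emitOp(op, true); }
  // Op arriving from the server: emit with source = false.
  receiveRemote(op) { this.emitOp(op, false); }
  emitOp(op, source) { for (const fn of this.listeners) fn(op, source); }
}

// Stand-in for an editor that fires onChange for every applied change.
class FakeEditor {
  constructor() { this.applied = []; this.onChange = null; }
  apply(op, options = {}) {
    this.applied.push(op);
    if (this.onChange) this.onChange(op, options.remote === true);
  }
}

function bind(doc, editor) {
  // Editor -> doc: forward only edits that the user made locally.
  editor.onChange = (op, remote) => {
    if (remote) return; // came from the server, do not send it back
    doc.submitOp(op);
  };
  // Doc -> editor: apply only ops that originated elsewhere.
  doc.on('op', (op, source) => {
    if (source) return; // our own op, the editor already has it
    editor.apply(op, { remote: true });
  });
}

const doc = new FakeDoc();
const editor = new FakeEditor();
bind(doc, editor);

editor.apply({ insert: 'local image' });       // user edit on this client
doc.receiveRemote({ insert: 'remote image' }); // op from another client

console.log(doc.submitted.length); // 1
console.log(editor.applied.length); // 2
```

Without the `remote` check in `onChange`, the remote op would be passed to `submitOp` again, which matches the symptom of every additional client producing one more copy of each op.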