Understanding conflict resolution

gasparyanyur commented 4 years ago

Hi all :) Today I read that in OrbitDb documentation speaks about the problem of consistency. And also I read somewhere that the orbitdb guarantee that all peers in p2p network at one moment in time will receive the same data in the same order. In fact, this is a good argument to implement many great projects based on orbits. But there is one problem that I can’t solve at all. On what basis does the orbit guarantee that all nodes in a peer-to-peer network will receive the same data at the same time in the same order?

phillmac commented 4 years ago

The guarentee isn't that a node will recive the data in the same order, it is that two nodes will eventually have the same state regardless of the order they recive the data.

gasparyanyur commented 4 years ago

but how is it proved that they will have the same state?
one more question please. there is a node A that sends messages, and after 5 minutes the node B also sends messages. Is there a chance that the node C will receive messages B first and then A messages ? please, considering the fact that the node B replied to the message A

aphelionz commented 4 years ago

Hi @gasparyanyur . Thanks for your patience in waiting for me to get back to you.

@phillmac is right. It's a lot less about when the events are written to the log then it is about how they are processed to create the state of the database. This is accomplished using something called a Lamport Clock, where the "time" of each value is a tuple. The tuple contains the ID of the node, as well as a a monotonically increasing integer value for each write, i.e. 1 2 3 4 5, etc.

Let's step through your example:

Node A opens the app and makes four writes. In the log, they would have the timestamps of sometihng like A1, A2, A3, A4. A4 will be referred to as the HEAD, or latest entry, in A's log.
A signs out.
Node B, in the meantime does its own writes independently: B1 B2 B3. B3 is the head.
B signs out.
Node C signs on. Node A and B sign back up as well.
Nodes A, B, and C would all receive the HEADs from each other, and start stitching together the logs. Once all the entries are received, the Lamport clock can do its work by first sorting by the integer time, and then sorting by the node ID. This allows a deterministic ordering of events from a distributed system.

A great place to dive in deep and learn more are the automated tests inside IPFS-log. All of them are instructive on this topic

aphelionz commented 4 years ago

@gasparyanyur Just checking in here again, is the above explanation satisfactory?

orbitdb / field-manual

Understanding conflict resolution #107