imnotteixeira / dissertation

[Paper] Conflict-Free Replicated Data Types #64

Closed imnotteixeira closed 3 years ago

imnotteixeira commented 3 years ago

https://link.springer.com/chapter/10.1007%2F978-3-642-24550-3_29

Replicating data under Eventual Consistency (EC) allows any replica to accept updates without remote synchronisation. This ensures performance and scalability in large-scale distributed systems (e.g., clouds). However, published EC approaches are ad-hoc and error-prone. Under a formal Strong Eventual Consistency (SEC) model, we study sufficient conditions for convergence. A data type that satisfies these conditions is called a Conflict-free Replicated Data Type (CRDT). Replicas of any CRDT are guaranteed to converge in a self-stabilising manner, despite any number of failures. This paper formalises two popular approaches (state- and operation-based) and their relevant sufficient conditions. We study a number of useful CRDTs, such as sets with clean semantics, supporting both add and remove operations, and consider in depth the more complex Graph data type. CRDT types can be composed to develop large-scale distributed applications, and have interesting theoretical properties.

imnotteixeira commented 3 years ago

Conflict-free Replicated Data Types (CRDTs) take a different approach to the problem of coordinating concurrent inputs in real time in a groupware system.

CRDTs can be operation-based [1][2], similar to Operational Transformation (OT), or state-based [3][4].

  1. Letia, Mihai; Preguiça, Nuno; Shapiro, Marc (1 April 2010). "Consistency without Concurrency Control in Large, Dynamic Systems". SIGOPS Oper. Syst. Rev. 44 (2): 29–34. doi:10.1145/1773912.1773921.
  2. Baquero, Carlos; Almeida, Paulo Sérgio; Shoker, Ali (3 June 2014). Magoutis, Kostas; Pietzuch, Peter (eds.). "Making Operation-Based CRDTs Operation-Based". Lecture Notes in Computer Science. Springer Berlin Heidelberg. pp. 126–140. CiteSeerX 10.1.1.492.8742. doi:10.1007/978-3-662-43352-2_11. ISBN 9783662433515.
  3. Baquero, Carlos; Moura, Francisco (1 October 1999). "Using Structural Characteristics for Autonomous Operation". SIGOPS Oper. Syst. Rev.: 90–96.
  4. Almeida, Paulo Sérgio; Shoker, Ali; Baquero, Carlos (13 May 2015). Bouajjani, Ahmed; Fauconnier, Hugues (eds.). "Efficient State-Based CRDTs by Delta-Mutation". Lecture Notes in Computer Science. Springer International Publishing. pp. 62–76. arXiv:1410.2803. doi:10.1007/978-3-319-26850-7_5. ISBN 9783319268491.

Operation-based CRDTs are also called commutative replicated data types, or CmRDTs. CmRDT replicas propagate state by transmitting only the update operation, much as OT does. Each replica receives the updates and applies them locally. The operations are commutative but not necessarily idempotent. The communication infrastructure must therefore ensure that every operation on a replica is delivered to the other replicas without duplication, though in any order.
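As a rough illustration of why CmRDTs tolerate reordering but not duplication, here is a minimal sketch of an operation-based counter. The names `Add`, `OpCounter`, and `deliver` are hypothetical and not taken from the paper; the "network" is just a Python list of operations.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Add:
    """Hypothetical update operation: add `amount` (may be negative)."""
    amount: int


class OpCounter:
    """Illustrative operation-based counter replica (not from the paper)."""

    def __init__(self) -> None:
        self.value = 0

    def apply(self, op: Add) -> None:
        # Addition commutes, so the delivery order does not matter.
        # It is not idempotent: applying the same op twice double-counts.
        self.value += op.amount


def deliver(ops) -> int:
    """Apply ops to a fresh replica in the given order; return its value."""
    replica = OpCounter()
    for op in ops:
        replica.apply(op)
    return replica.value


ops = [Add(5), Add(-2), Add(10)]

# Every possible delivery order converges to the same value.
results = {deliver(order) for order in itertools.permutations(ops)}

# A duplicated delivery diverges from the replicas that saw each op once.
duplicated = deliver(ops + [ops[0]])
```

Here `results` contains a single value for all six orderings, while `duplicated` differs from it, which is the reason the delivery layer has to suppress duplicates.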

State-based CRDTs are called convergent replicated data types, or CvRDTs. In contrast to CmRDTs, CvRDTs send their full local state to other replicas, where the states are merged by a function that must be commutative, associative, and idempotent. The merge function provides a join for any pair of replica states, so the set of all states forms a semilattice. The update function must monotonically increase the internal state, according to the same partial order as the semilattice.
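The merge requirements can be checked on a small example. The sketch below uses a grow-only counter (G-Counter), a commonly used state-based CRDT, with element-wise maximum as the join. The class and method names (`GCounter`, `leq`) are illustrative assumptions, not an API from the paper.

```python
from typing import Dict, Optional


class GCounter:
    """Illustrative grow-only counter: one non-negative slot per replica."""

    def __init__(self, replica_id: str,
                 counts: Optional[Dict[str, int]] = None) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = dict(counts or {})

    def increment(self, n: int = 1) -> None:
        # Updates may only grow the local slot, so the state moves
        # upward in the partial order defined by `leq`.
        if n < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        # Join of the semilattice: element-wise maximum of the two states.
        keys = self.counts.keys() | other.counts.keys()
        merged = {k: max(self.counts.get(k, 0), other.counts.get(k, 0))
                  for k in keys}
        return GCounter(self.replica_id, merged)

    def leq(self, other: "GCounter") -> bool:
        # Partial order: every slot is at most the matching slot in `other`.
        return all(v <= other.counts.get(k, 0) for k, v in self.counts.items())


a, b, c = GCounter("a"), GCounter("b"), GCounter("c")
a.increment(3)
b.increment(2)
c.increment(4)

commutative = a.merge(b).counts == b.merge(a).counts
associative = a.merge(b).merge(c).counts == a.merge(b.merge(c)).counts
idempotent = a.merge(a).counts == a.counts
total = a.merge(b).merge(c).value()

# An update moves the state upward in the partial order.
before = GCounter("a", a.counts)
a.increment()
monotonic = before.leq(a) and not a.leq(before)
```

Because the merge is a join, replicas can exchange states in any order, any number of times, and still reach the same result; the trade-off is that the full state is shipped on every exchange.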

imnotteixeira commented 3 years ago

LINKS TO CITE: