cachapa / crdt

Dart implementation of Conflict-free Replicated Data Types (CRDTs)
https://pub.dev/packages/crdt
Apache License 2.0
159 stars 20 forks source link

Add generic crdt interface #1

Closed MarcelGarus closed 4 years ago

MarcelGarus commented 4 years ago

Hi, I'd really love for this to become a generic CRDT package offering multiple types, not just maps. What do you think about having a very generic Crdt interface that can be implemented by loads of types? This package could provide several default types, but over time, more types could be added and developers could create custom CRDTs.

For example, consider an interface like the following:

abstract class Crdt<T extends Crdt<T>> {
  const Crdt._();

  /// Merges [this] and an [other] instance of type [T].
  /// If there are instances a, b and c of type T, these conditions should hold true:
  /// - a.merge(a) == a
  /// - a.merge(b) == b.merge(a)
  /// - a..merge(b)..merge(c) == a..merge(c)..merge(b)
  void merge(CrdtContext context, T other);
}

// Implementing like this:
class SomeCrdtType extends Crdt<SomeCrdtType> {
  void merge(CrdtContext context, SomeCrdtType other) {
    ...
  }
}

The CrdtContext could contain global information shared by multiple CRDTs like a UUID of the current client and maybe UUIDs of other clients.

Several types could then implement this interface:

I'd love to hear your opinion on this. 😊

cachapa commented 4 years ago

Hi, thanks for the proposal.

Can you expand on why you think the current implementation isn't generic enough? The Crdt class itself has generic key and value types and maps are only used as a structured way to export and import (or rather, merge) CRDTs. This is especially useful since CRDTs are intended to be used in distributed systems, and maps lend themselves to easy serialisation, e.g. to and from JSON.

Moreover, the Crdt class is intended to be used with a backing persistent store, which makes it less likely that you'd want multiple instances of a local CRDT for the same dataset.

I'm also unclear on the need for client UUIDs. From the CRDT perspective the client id shouldn't be necessary to resolve conflicts.

Have you had a chance to look at my CRDT server project? It implements a truly generic CRDT intended to use as a test server for any client type.

MarcelGarus commented 4 years ago

Hmm. I was trying to encode the most generic definition of CRDTs in an interface: Any data type with a merge method can be used as a CRDT.

That could be maps, but also simpler types. For example, a counter like the following:

class MonotonicallyIncreasingCounter implements Crdt<MonotonicallyIncreasingCounter> {
  final _countsByClients = <Uuid, int>{};
  int get count => _countsByClients.values.reduce((a, b) => a + b);

  void merge(MonotonicallyIncreasingCounter other) {
    other._countsByClients.forEach((client, count) {
      _countsByClients[count] = max(_countsByClients[count], count);
    });
  }

  void increase() => _countsByClients[myId]++;
}

The counter is inspired by this video.

Users of this library could then compose more complex CRDTs based on these primitives:

class Post implements Crdt<Post> {
  LastWriteWins<String> photoUrl;
  MonotonicallyIncreasingCounter views;
  OnlyAddSet<Comment> comments;

  ...

  void merge(Post other) {
    photoUrl.merge(other.photoUrl);
    views.merge(other.views);
    comments.merge(other.comments);
  }
}

Disclaimer: I am new to the topic of CRDTs, so maybe I'm missing something here.

cachapa commented 4 years ago

From what I understand from that video, he's talking about using grow-only sets, which are necessary if you want to implement something like a global incrementing counter.

This library implements a last-writer-wins approach, which I consider to be a superset of the above type, i.e. you can implement grow-only CRDTs using last-writer-wins.

In that case you'd add operations (e.g. increment) to a growing set and avoid conflicts by using client ids to identify each entry. This puts the burden of computing the operation set on the client, but if that's an issue there are other strategies which can be used to mitigate it.

MarcelGarus commented 4 years ago

Ohhh I see. I was thinking it may be reasonable to create a generic interface that allows multiple approaches to be implemented, but if last-writer-wins is a superset of all approaches, that's of course not necessary. Thanks for clearing that up! :blush:

cachapa commented 4 years ago

No worries, thanks for the interesting discussion. It's always good to question your current approach and consider different use-cases.