streamich / json-joy

JSON CRDT, JSON CRDT Patch, JSON Patch+, JSON Predicate, JSON Pointer, JSON Expression, JSON Type
https://jsonjoy.com/libs/json-joy-js
Apache License 2.0
696 stars, 11 forks

Compacting History #628

Open crystalthoughts opened 1 month ago

crystalthoughts commented 1 month ago

Hi, do I need to be mindful of history length when using this library, for example with a document that has a lot of deletes? Does it act like an immutable data structure?

Do I need to create my own 'flattening' algorithm? For example, I could take the history from n steps ago and create a new CRDT based on the absolute value at that point plus the n undo steps, in order to create a moving window.

Or am I misunderstanding how they work?

streamich commented 1 month ago

Hi!

It depends on what you are storing. If you are referring to the document itself (i.e. the Model), then you don't need to worry about cleaning up the history. Essentially there are two node types: (1) LWW (last-write-wins registers), and (2) RGA (list CRDTs). The LWW nodes don't store any history, only the latest value. The RGA nodes create very compact tombstones for deleted data, so you don't need to worry about them.
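To make the difference between the two node types concrete, here is a single-replica toy sketch. It is not json-joy's code or its internal layout: `ToyLwwRegister`, `ToyRgaString`, and the integer IDs and timestamps are hypothetical names invented for illustration, and conflict resolution between concurrent edits is left out entirely. It only models what each kind of node retains after an overwrite or a delete.

```typescript
// Toy illustration only: these are NOT json-joy classes.

// Last-write-wins register: retains only the value with the highest timestamp.
class ToyLwwRegister<T> {
  private time = -1;
  private value: T | undefined = undefined;

  set(value: T, time: number): void {
    // An older (or equally old) write is ignored and leaves no trace.
    if (time > this.time) {
      this.time = time;
      this.value = value;
    }
  }

  get(): T | undefined {
    return this.value;
  }
}

// One element of the toy list: deletion drops the payload but keeps the ID,
// so later inserts that reference this element can still be positioned.
interface ToyElement {
  id: number;
  char: string | null; // null marks a tombstone
}

class ToyRgaString {
  private elements: ToyElement[] = [];
  private nextId = 0;

  // Inserts `char` right after the element with ID `afterId`.
  // A negative or unknown `afterId` inserts at the start.
  insertAfter(afterId: number, char: string): number {
    const id = this.nextId++;
    const index = this.elements.findIndex((e) => e.id === afterId) + 1;
    this.elements.splice(index, 0, {id, char});
    return id;
  }

  // Turns the element into a tombstone instead of removing it.
  delete(id: number): void {
    const element = this.elements.find((e) => e.id === id);
    if (element) element.char = null;
  }

  // The visible text skips tombstones.
  view(): string {
    return this.elements.map((e) => e.char ?? '').join('');
  }

  tombstones(): number {
    return this.elements.filter((e) => e.char === null).length;
  }
}
```

In this toy model the register never grows, no matter how many writes it receives, while the list grows by one small marker per deleted element.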

Now, if you are storing all patches (Patch instances) since the beginning of the document, then it is up to you how much space you are willing to use. The longer the history you keep around, the more likely you will be able to send the necessary updates to a peer that is missing some patches. If you have a central server, you only need to store the full history on that central server.
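As a rough sketch of the central-server case, the log below keeps every serialized patch in arrival order and hands a peer whatever it has not seen yet. `PatchLog` and its methods are hypothetical and not part of json-joy. The sketch assumes each patch is stored as opaque bytes and that a peer can report how many log entries it has already received, which is a simplification of how a real sync protocol would track missing patches.

```typescript
// Hypothetical append-only log of serialized patches, kept on the server.
class PatchLog {
  private readonly entries: Uint8Array[] = [];

  // Appends one serialized patch and returns its position in the log.
  append(binary: Uint8Array): number {
    this.entries.push(binary);
    return this.entries.length - 1;
  }

  // Returns the patches a peer is missing, given how many it already has.
  since(seenCount: number): Uint8Array[] {
    return this.entries.slice(Math.max(0, seenCount));
  }

  // Total number of patches stored.
  size(): number {
    return this.entries.length;
  }
}

// Usage: a peer that has seen the first two patches asks for the rest.
const log = new PatchLog();
log.append(new Uint8Array([1]));
log.append(new Uint8Array([2]));
log.append(new Uint8Array([3]));
const missing = log.since(2);
```

Because the log is never trimmed, any peer can be brought up to date from it, at the cost of storage that grows with every patch.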

crystalthoughts commented 1 month ago

Thanks for the summary :) If I wanted to store 10 Patches behind the current state of the Model, how would that work? Am I thinking about that in the right way?

streamich commented 4 weeks ago

> If I wanted to store 10 Patches behind the current state of the Model, how would that work?

You just store the patches then. They have built-in .toBinary() and .fromBinary() methods.

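import {Patch} from 'json-joy/lib/json-crdt';

// `patch` is an existing Patch instance.
// Serialize it to binary for storage or transport.
const binary = patch.toBinary();
// Restore an equivalent Patch from the stored bytes.
const patch2 = Patch.fromBinary(binary);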

> Am I thinking about that in the right way?

You only need to store the patches on the node that is expected to bring other nodes up to speed. For example, if you have a central server, you would store all the patches only on that central server. If you have a peer-to-peer app, you would store all the known patches on each peer.
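For the "10 patches behind" idea from the question, one possible shape is a bounded buffer that keeps only the most recent serialized patches. This is a minimal sketch, not something the library provides: `PatchWindow` and `limit` are hypothetical names, and the patches are assumed to be stored as the bytes produced by .toBinary().

```typescript
// Hypothetical rolling window over the most recent serialized patches.
class PatchWindow {
  private readonly patches: Uint8Array[] = [];
  private readonly limit: number;

  constructor(limit: number) {
    if (!Number.isInteger(limit) || limit < 1) {
      throw new RangeError('limit must be a positive integer');
    }
    this.limit = limit;
  }

  // Stores the newest patch and drops the oldest one once the window is full.
  push(binary: Uint8Array): void {
    this.patches.push(binary);
    if (this.patches.length > this.limit) this.patches.shift();
  }

  // Returns a copy of the retained patches, oldest first.
  list(): Uint8Array[] {
    return [...this.patches];
  }
}

// Usage: keep the last 10 of 12 patches.
const recent = new PatchWindow(10);
for (let i = 0; i < 12; i++) recent.push(new Uint8Array([i]));
const kept = recent.list();
```

The trade-off is the one described above: a peer that is more than `limit` patches behind cannot be caught up from this window alone.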