orbitdb-archive / ipfs-log

Append-only log CRDT on IPFS
https://orbitdb.github.io/ipfs-log/
MIT License

Explicit Merge Nodes #202

Closed by aphelionz 5 years ago

aphelionz commented 5 years ago

From: https://github.com/ipfs/dynamic-data-and-capabilities/issues/50

Have a way to explicitly create a merge node when we are merging other replicas' nodes. One use case: once a certain number of heads is reached, we want to reduce them to just one (the merge node) so that the list of heads occupies less space inside a CRDT.

aphelionz commented 5 years ago

cc @satazor

aphelionz commented 5 years ago

Perhaps this can be an argument passed to join?

log1.join(log2)        // normal behavior (merge defaults to false)
log1.join(log2, true)  // or: log1.join(log2, { merge: true })

The second line, in either form, would create a merge entry that becomes the new single head of the log.
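To make the proposed option concrete, here is a minimal sketch on a toy in-memory log. It is not the ipfs-log implementation: `ToyLog`, its entry shape (`id`, `payload`, `next`), and the `{ merge }` option are all assumptions made up for illustration, and the real library is asynchronous and content-addressed.

```javascript
// Toy in-memory model, NOT the ipfs-log implementation.
// A hypothetical entry is { id, payload, next }, where `next` lists the ids it points to.
class ToyLog {
  constructor (id) {
    this.id = id
    this.entries = new Map() // entry id -> entry
    this.counter = 0
  }

  // Heads are the entries that no other entry references via `next`.
  get heads () {
    const referenced = new Set()
    for (const e of this.entries.values()) e.next.forEach(n => referenced.add(n))
    return [...this.entries.values()].filter(e => !referenced.has(e.id))
  }

  // A new entry points at every current head, so it becomes the only head.
  append (payload) {
    const entry = {
      id: `${this.id}-${this.counter++}`,
      payload,
      next: this.heads.map(h => h.id)
    }
    this.entries.set(entry.id, entry)
    return entry
  }

  // Hypothetical `merge` option: after taking the union of entries,
  // append an empty entry when more than one head remains.
  join (other, { merge = false } = {}) {
    for (const [id, e] of other.entries) this.entries.set(id, e)
    if (merge && this.heads.length > 1) this.append(null)
    return this
  }
}

const log1 = new ToyLog('A')
const log2 = new ToyLog('B')
log1.append('one')
log2.append('two')

log1.join(log2) // plain join: both replicas' latest entries stay heads
const headsAfterPlainJoin = log1.heads.length

log1.join(log2, { merge: true }) // merge join: one empty entry on top
const headsAfterMergeJoin = log1.heads.length
```

In this sketch the merge entry is only written when there is more than one head, so repeated merge joins on an already-collapsed log do not add entries.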

cc @haadcode @thiagodelgado111 @satazor @shamb0t

haadcode commented 5 years ago

I don't think we should have explicit merge nodes: they add complexity, imo they are not a good abstraction, and we already have the equivalent, which can be formally defined as the union of all heads in the log.

We discussed this with @satazor and @pgte in the original issue and in a call we had before starting the work. Iirc, what @satazor meant by a merge node was that there should be a way to query the heads of the log, which we already have as log.heads. @satazor, correct me if I remember incorrectly.
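For readers unfamiliar with the term, the set of heads mentioned above can be sketched as "every entry that no other entry points to". The entry shape (`id`, `next`) and the `findHeads` helper below are hypothetical, chosen only to illustrate the definition; they are not the library's actual format or API.

```javascript
// Hypothetical entry shape { id, next }; not the real ipfs-log entry format.
// An entry is a head when no other entry lists it in `next`.
function findHeads (entries) {
  const referenced = new Set(entries.flatMap(e => e.next))
  return entries.filter(e => !referenced.has(e.id)).map(e => e.id)
}

const entries = [
  { id: 'a1', next: [] },
  { id: 'a2', next: ['a1'] },      // written on top of a1
  { id: 'b1', next: [] },          // written concurrently on another replica
  { id: 'c1', next: ['a1', 'b1'] } // written after seeing a1 and b1, but not a2
]

const heads = findHeads(entries) // ['a2', 'c1']
```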

I believe we don't need this at all and this issue can be closed, but do let me know if you see it differently 👍

aphelionz commented 5 years ago

Closing for now, then. Re-open if necessary.

satazor commented 5 years ago

The reason for having an explicit merge is to keep the list of heads small. Consider a highly concurrent scenario where replicas are mostly offline. If they sync with each other at nearly the same time, the log will have a large list of heads, and that list will stay large until a new add is performed. There are cases in which we may want to explicitly create a merge to keep the list of heads small.

The use case was discussify, where each comment contains the list of heads. That list will live inside a CRDT whose state we want to keep small, so that syncing and cold boot are faster.
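The scenario above can be sketched with a toy model in which a log is a plain array of entries. Everything here (`headsOf`, `add`, `join`, the entry shape, the replica names) is an assumption made for illustration and not the ipfs-log API; it only shows how the head count tracks the number of concurrent writers until the next add.

```javascript
// Toy model: a log is a plain array of hypothetical { id, next } entries.
const headsOf = log => {
  const referenced = new Set(log.flatMap(e => e.next))
  return log.filter(e => !referenced.has(e.id))
}

// Adding an entry points it at every current head.
const add = (log, id) => [...log, { id, next: headsOf(log).map(h => h.id) }]

// Joining is a union of entries, de-duplicated by id.
const join = (a, b) => {
  const seen = new Set(a.map(e => e.id))
  return [...a, ...b.filter(e => !seen.has(e.id))]
}

// A shared root, then five replicas that each add one entry while offline.
const root = add([], 'root')
const replicas = ['r1', 'r2', 'r3', 'r4', 'r5'].map(id => add(root, id))

// All replicas sync at nearly the same time: one head per concurrent writer.
const synced = replicas.reduce((acc, log) => join(acc, log))
const headCountAfterSync = headsOf(synced).length // 5

// The next add references every head and becomes the only head.
const afterAdd = add(synced, 'next')
const headCountAfterAdd = headsOf(afterAdd).length // 1
```

Anything that embeds the list of heads, such as the comment structure described above, would carry five references after the sync and a single one after the add.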

aphelionz commented 5 years ago

Just to learn more about this proposal, is this simply accomplished by:

// let log1 and log2 be logs with >1 entries each
log1.join(log2)

// let MergeEntry be an entry with an empty payload
log1.append(MergeEntry) // this becomes the new head