AljoschaMeyer / set-reconciliation

An informal description on how to compute set union between two computers.
52 stars 3 forks source link

New Implementation #2

Closed hoytech closed 11 months ago

hoytech commented 1 year ago

Hey, just wanted to let you know that I made an implementation of this:

https://github.com/hoytech/negentropy

Thanks!

AljoschaMeyer commented 1 year ago

Cool, glad this was of use. Out of curiosity, have you been aware of the implementation used in earthstar?

Just a few days ago I thought about how nostr is essentially a network based on the trivial reconciliation protocol and would benefit greatly from a more efficient reconciliation technique. Do you have any plans on pushing for making this an official part of nostr, or is it only intended as a component of your particular implementation?

hoytech commented 1 year ago

No, I did not know about that implementation, thanks for the link. Maybe it would make sense to have a list of known implementations in your repo? The earthstar implementation doesn't seem to define a wire protocol, unless I'm mistaken. But it looks like it does contain an implementation of a tree data-structure, which is cool.

negentropy is sort of the opposite. It's just the wire protocol and a couple simple reference implementations and tests. My thinking is that more advanced implementations would live in their own separate repos/projects. To me that's one of the main attractions of your approach: that applications can use different internal storage layouts as appropriate.

I actually have the beginning of a copy-on-write tree for caching fingerprints, but the C++ implementation is more efficient than I expected so I'm not even sure now that it will be necessary. After tweaking alignment and such, combining two fingerprints needs just two pxor SIMD instructions. Plus, I think one of the major use-cases will be filter-specific syncs, which I think will need to re-compute the fingerprints from scratch anyway since they are unpredictable non-contiguous slices of the full data-set (I haven't understood your multi-dimensional range query suggestion yet so I'm not sure if it would help).

About nostr, yes, I intend to propose a NIP for negentropy once we've got a few clients and hopefully some more relays integrated. The redundant bandwidth consumed by popular relays now is significant and unsustainable, IMO.

AljoschaMeyer commented 11 months ago

Added a list of implementations at the end of the readme.

hoytech commented 11 months ago

Thanks!