near / nearcore

Reference client for NEAR Protocol
https://near.org
GNU General Public License v3.0
2.31k stars 619 forks source link

Implement Efficient Set Reconciliation algorithm #3934

Closed pmnoxx closed 2 years ago

pmnoxx commented 3 years ago

We should implement and test a few different approaches to Set Reconciliation problem.

White paper provided by @abacabadabacaba

https://www.ics.uci.edu/~eppstein/pubs/EppGooUye-SIGCOMM-11.pdf

I found one existing rust library for Set Reconciliation : https://crates.io/crates/minisketch-rs You can find explanation how it works here https://github.com/sipa/minisketch/blob/de98387a83ccd5dd7399333ddcf52e1d4f344e53/README.md

known algorithms

performance inserting 1m elements (3 hashes) - 60ms recover 10k elements - 0.9ms recover 100k elements - 10.5ms recover 1m elements - 195ms

pmnoxx commented 3 years ago

@bowenwang1996

pmnoxx commented 3 years ago

I tested minisketch it works, but we won't be able to use it. They use a variant, which produces a sketch which supports an add operation, but it doesn't support remove operation. This would require to completely recompute sketches on edge removal, so this is not usable for us. https://github.com/eupn/minisketch-rs/blob/master/examples/simple.rs

abacabadabacaba commented 3 years ago

It was me who provided that white paper, not @evgenykuzyakov.

pmnoxx commented 3 years ago

@abacabadabacaba Sorry, fixed.

pmnoxx commented 3 years ago

I did some early benchmarking creating sketch with 1m(32bits) elements using minisketch-rs - takes around 2s my unoptimized code - IBF 1m (64bits) elements - 72ms In the white paper they claim it takes 4s to insert 1m elements

pmnoxx commented 3 years ago

I tested the time it takes to recover the difference, assuming we use 4 hashes. 10k elements - 1.1ms 100k elements - 22ms 1m elements - 284ms

pmnoxx commented 3 years ago

That should be fine, we need to spend 2ms to verify each edge, for 1m edges that would take 30m.

pmnoxx commented 3 years ago

Here is a link to implementation, I also added benches: @bowenwang1996 https://github.com/pmnoxx/near-set-reconciliation/blob/master/sr-playground/src/sketch.rs

pmnoxx commented 3 years ago

@abacabadabacaba You had some ideas during the meeting on how to improve BLT. Would you mind writing them down here?

pmnoxx commented 3 years ago

I implemented full algorithm for syncing. You can see how it works in a unit test: https://github.com/pmnoxx/near-set-reconciliation/blob/21bcdfee1a1c30e11f8df729fc49ebcc7f0af5cf/sr-playground/src/blt_graph.rs#L99 I'll write a document with explanation.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

bowenwang1996 commented 2 years ago

4112 is merged