Implemented Batched Proof for multiple openings and Path Pruning

intx4 commented 7 months ago

Description

This PR implements an optimized struct MultiPath which holds multiple proofs for multiple leaves. A MultiPath is generated via MerkleTree::generate_multi_proof(...) which takes an iterable of leaf indexes. MultiPath optimizes the proof size by compressing multiple paths using Front Encoding. To verify, MultiPath exposes the same API as Path, with a verify(...) method which takes an iterable of leaf hashes instead of a single hash. For verification, MultiPath optimizes runtime by using a lookup table to avoid recomputing redundant hashes of nodes. The only caveat to MultiPath is that, when calling generate_multi_proof, the provided array of indexes is internally sorted to maximize compression efficiency. The array of leaves hashes provided at verification time should then match this order. The documentation provides an example code snippet. Files affected:

src/merkle_tree/mod.rs
src/merkle_tree/tests/mod.rs

closes: #124

Before we can merge this PR, please make sure that all the following items have been checked off. If any of the checklist items are not applicable, please leave them but write a little note why.

[x] Targeted PR against correct branch (main)
[x] Linked to Github issue with discussion and accepted design OR have an explanation in the PR that describes this work.
[x] Wrote unit tests
[x] Updated relevant documentation in the code
[ ] Added a relevant changelog entry to the Pending section in CHANGELOG.md
[ ] Re-reviewed Files changed in the Github PR explorer

mmagician commented 7 months ago

That's a great addition, thanks! I left minor comments for now, though haven't gone over the actual algorithm yet. Did you run any benches to compare performance to having individual paths?

intx4 commented 7 months ago

That's a great addition, thanks! I left minor comments for now, though haven't gone over the actual algorithm yet. Did you run any benches to compare performance to having individual paths?

I have implemented the benches to compare sequential generation of proofs vs generating a multiproof and sequential verification vs multiproof verification. There is a big increase in runtime for generating the proofs (for 2^20 leaves we go from ~400ms to generate the proofs to ~2.5s for generating the multiproof) and an increase of 46% for verification runtime (on my Thinkpad X1 gen10). I think the major bottleneck is the path compression and decompression part (I guess when verifying the optimization with the LUT amortizes the cost of having to first decompress, while in the proof generation the bottleneck of compression is more evident). The biggest saving is in the proof size though, from O(n*logn) to O(polylog(n)) where n is the num of leaves.

I don't think there is a trivial way to optimize the generation of multiproof further (or at least I don't see it at the moment). We could optimize the verification by using cfg_into_iter! and using some crate providing a thread safe implementation of HashMap to distribute the verification of each path across threads (after sequential decompression).

intx4 commented 7 months ago

@mmagician I fixed the code in decompress to avoid the extra loop and removed the decompress function entirely to merge the decompression step directly into verification, thus removing another loop. According to benches this has decreased the runtime of around 7% w.r.t the previous implementation with the extra loops.

EDIT: did a similar optimization for compression. I removed compress and pushed compression into generation. This speeds up things a lot (70%)

intx4 commented 7 months ago

It is very strange that batch verification is slower than naive - if anything, by using the lookup table I'd expect a speedup. Were you able to identify which part of the new batch verify is causing this?

the PR has been updated, now in the single threaded case MultiProof Verification is faster (verification test runs in ~2.8s on my thinkpad X1 Gen10 against ~3.8s of single proof verification). For reference, multiproof is only slightly slower for creation of the proof (tests for generation run in ~600ms against ~400ms for single proof generation), but on the other size the proof size is smaller and verification is much faster.

intx4 commented 5 months ago

@mmagician the changes proposed by @Cesar199999 have been merged to the main branch of my fork

intx4 commented 5 months ago

That's great! I think the checks are failing for the same reason as in other repos. @autquis already addressed these in this repo in #136.

I was literally about to ask about the failing tests :D

intx4 commented 5 months ago

That's great! I think the checks are failing for the same reason as in other repos. @autquis already addressed these in this repo in #136.

@mmagician I followed @autquis commit to fix the tests, all good

arkworks-rs / crypto-primitives

Implemented Batched Proof for multiple openings and Path Pruning #130

Description