SpinResearch / merkle.rs

:christmas_tree: Merkle tree in Rust
BSD 3-Clause "New" or "Revised" License

Consider representing a proof as a vector of ProofBlock instead of a linked-list? #4

Open FredericJacobs opened 7 years ago

FredericJacobs commented 7 years ago

Pros and cons?

romac commented 7 years ago

Pros:

Cons:

Suggestion:

Let's first write benchmarks and see where we currently stand, space- and time-wise, and which code paths make the most sense to optimize first. I think it's very likely that the current O(n · log(n)) proof-generation time will dwarf everything else. It is true, though, that proof generation will usually happen on a different machine, or at a different time, than proof checking, so it might make sense to treat the two code paths as separate "libraries" and benchmark them individually.

FredericJacobs commented 7 years ago

We won't address this immediately. It will likely be resolved in the future by a different way of encoding the Merkle tree structure.

briansmith commented 7 years ago

I submitted a couple of PRs that will hopefully help with this.

As far as my own use of Merkle tree signatures is concerned, it would be ideal if the verification of a serialized proof could be done without using the heap at all. In particular, if we have the serialized proof as a &[u8], then it should be possible to verify the proof without actually building a tree or allocating any memory.
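The heap-free verification briansmith describes can be sketched as follows. This is a hypothetical illustration, not this crate's API: `Digest`, `hash_pair`, and `verify` are made-up names, and the toy FNV-based combiner stands in for a real cryptographic hash. The point is that walking a serialized proof as a `&[u8]` needs only a single running digest on the stack.

```rust
// Hypothetical sketch: verifying a serialized Merkle proof without any heap
// allocation, by walking a &[u8] of fixed-size sibling digests.

const DIGEST_LEN: usize = 8;
type Digest = [u8; DIGEST_LEN];

/// Toy combining hash (FNV-1a over both inputs); a stand-in for a real
/// cryptographic hash such as SHA-256. NOT secure.
fn hash_pair(left: &Digest, right: &Digest) -> Digest {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in left.iter().chain(right.iter()) {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h.to_be_bytes()
}

/// Verify `leaf` against `root`, given the leaf's index and the proof as a
/// flat byte slice of sibling digests (bottom-up). No tree is built and no
/// memory is allocated: only one running digest lives on the stack.
fn verify(root: &Digest, leaf: Digest, index: usize, proof: &[u8]) -> bool {
    if proof.len() % DIGEST_LEN != 0 {
        return false;
    }
    let mut acc = leaf;
    let mut idx = index;
    for chunk in proof.chunks_exact(DIGEST_LEN) {
        let mut sibling = [0u8; DIGEST_LEN];
        sibling.copy_from_slice(chunk);
        // The index's low bit says whether we are the left or right child.
        acc = if idx & 1 == 0 {
            hash_pair(&acc, &sibling)
        } else {
            hash_pair(&sibling, &acc)
        };
        idx >>= 1;
    }
    &acc == root
}
```

For a two-leaf tree, the serialized proof for leaf 0 is just the digest of leaf 1, and `verify` recomputes the root in one step.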

I don't know of any use cases for the signing side where such things really matter.

afck commented 6 years ago

In hbbft I tried representing the proof just as its index and the vector of sibling digests. For us, that reduced the size considerably, and the proof generation (MerkleTree::from_vec) and validation (Proof::validate) code is very simple. I think the latter doesn't use the heap at all, and the former does O(log(n)) instead of O(n²) heap allocations.
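The representation afck describes might look roughly like the sketch below. This is an illustrative stand-in, not hbbft's actual API: `Digest`, `hash_pair`, `Proof`, and `root_and_proof` are all assumed names, and the toy hash is not cryptographic. The proof is just the leaf index plus a `Vec` of sibling digests, so generating it costs O(log n) heap allocations for the proof itself.

```rust
// Hedged sketch of a proof stored as the leaf's index plus a Vec of sibling
// digests, rather than a linked list of nodes.

const DIGEST_LEN: usize = 8;
type Digest = [u8; DIGEST_LEN];

/// Toy combining hash (FNV-1a over both inputs); a stand-in for a real
/// cryptographic hash. NOT secure.
fn hash_pair(left: &Digest, right: &Digest) -> Digest {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in left.iter().chain(right.iter()) {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h.to_be_bytes()
}

/// A proof is the leaf's index plus its sibling digests, bottom-up.
struct Proof {
    index: usize,
    siblings: Vec<Digest>,
}

impl Proof {
    /// Recompute the root from the leaf and compare. The only heap use is
    /// the `siblings` Vec itself; the fold runs on the stack.
    fn validate(&self, leaf: &Digest, root: &Digest) -> bool {
        let mut acc = *leaf;
        let mut idx = self.index;
        for sibling in &self.siblings {
            // The low bit of the index says whether we are the left child.
            acc = if idx & 1 == 0 {
                hash_pair(&acc, sibling)
            } else {
                hash_pair(sibling, &acc)
            };
            idx >>= 1;
        }
        &acc == root
    }
}

/// Compute the root and a proof for `index` in one bottom-up pass, assuming
/// a power-of-two number of leaves for simplicity.
fn root_and_proof(leaves: &[Digest], index: usize) -> (Digest, Proof) {
    assert!(leaves.len().is_power_of_two() && index < leaves.len());
    let mut level: Vec<Digest> = leaves.to_vec();
    let mut siblings = Vec::new();
    let mut idx = index;
    while level.len() > 1 {
        siblings.push(level[idx ^ 1]); // our sibling at this level
        level = level
            .chunks_exact(2)
            .map(|pair| hash_pair(&pair[0], &pair[1]))
            .collect();
        idx >>= 1;
    }
    (level[0], Proof { index, siblings })
}
```

For four leaves, the proof for leaf 2 holds two digests: its sibling leaf 3, then the hash of leaves 0 and 1.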

I don't think it makes much of a difference in terms of CPU time, though. The bottleneck is probably hashing, and there's still the exact same number of hashes, of course.