Open mempirate opened 2 months ago
With the latest commit getting rid of heap allocation for chunks and replacing it with an array, benchmarks were even better for hashtree:
hash_tree_root_hashtree time: [7.8244 µs 7.8345 µs 7.8479 µs]
change: [-4.8425% -4.6332% -4.4083%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) high mild
4 (4.00%) high severe
Benchmarking generate_proof_hashtree: Warming up for 3.0000 s
generate_proof_hashtree time: [97.430 ms 97.634 ms 97.872 ms]
change: [-13.578% -12.719% -12.096%] (p = 0.00 < 0.05)
Performance has improved.
I think this shouldn't close #156. This PR creates a wrapper that hashes two chunks, not necessarily contiguous, with hashtree. As such, it only benefits from hashtree's padding block prescheduling but none of the vectorized/parallel pipelining. This is why you are getting only a 16%-20% benefit from it. When hashing two chunks at a time you also incur copies and allocations for every chunk you hash, something that is utterly unnecessary in most situations.
The way to actually gain from this library is to hash one entire layer at a time, by passing the whole layer, contiguously allocated, as the chunks slice in the function https://github.com/prysmaticlabs/hashtree/blob/main/examples/basic_usage.rs#L36
You can hash a Merkle tree of base 2^{N+1} by allocating only 2^N for the first layer, and then rewriting that allocated layer on successive runs, by passing the same slice as both chunks and out.
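To make the layer-at-a-time idea concrete, here is a minimal sketch of the buffer reuse described above. It does not call hashtree: `toy_hash_pair` is a hypothetical, non-cryptographic stand-in so the example builds with the standard library alone, and `hash_layer`, `hash_layer_in_place`, and `merkle_root` are illustrative names, not part of any crate. With a real batched hasher, each per-layer loop would presumably become a single call over the whole contiguous layer.

```rust
const CHUNK: usize = 32;
type Chunk = [u8; CHUNK];

/// Hypothetical stand-in for a real 64-byte -> 32-byte hash (NOT SHA-256,
/// NOT cryptographic). It only mixes bytes so the sketch runs with std alone.
fn toy_hash_pair(left: &Chunk, right: &Chunk) -> Chunk {
    let mut out = [0u8; CHUNK];
    for i in 0..CHUNK {
        out[i] = left[i].rotate_left(1) ^ right[(i + 1) % CHUNK].wrapping_add(i as u8);
    }
    out
}

/// Hash one whole layer: reads `2 * out.len()` chunks and writes `out.len()` digests.
/// This is the place where a batched hasher would take the entire layer at once.
fn hash_layer(out: &mut [Chunk], chunks: &[Chunk]) {
    assert_eq!(chunks.len(), out.len() * 2);
    for (digest, pair) in out.iter_mut().zip(chunks.chunks_exact(2)) {
        *digest = toy_hash_pair(&pair[0], &pair[1]);
    }
}

/// Collapse the first `2 * count` entries of `buf` into its first `count` entries.
/// Writing slot `i` is safe because it only depends on slots `2i` and `2i + 1`,
/// which are read before the write and never needed again afterwards.
fn hash_layer_in_place(buf: &mut [Chunk], count: usize) {
    for i in 0..count {
        let digest = toy_hash_pair(&buf[2 * i], &buf[2 * i + 1]);
        buf[i] = digest;
    }
}

/// Root of a tree with 2^{N+1} leaves using a single allocation of 2^N chunks.
fn merkle_root(leaves: &[Chunk]) -> Chunk {
    assert!(leaves.len() >= 2 && leaves.len().is_power_of_two());
    let mut count = leaves.len() / 2;
    let mut layer = vec![[0u8; CHUNK]; count]; // the only allocation
    hash_layer(&mut layer, leaves);
    while count > 1 {
        count /= 2;
        hash_layer_in_place(&mut layer, count); // reuse the same buffer
    }
    layer[0]
}
```

The point of the sketch is the memory pattern rather than the hash itself: one buffer of half the leaf count is allocated up front, and every later layer overwrites the front of that same buffer.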
@potuz agreed this doesn't resolve the whole issue. I will follow this up with another PR for the other performance improvements you mentioned.
Implements a new hash_chunks method that is used everywhere and whose implementation varies based on the features enabled.
Related to #156 but further optimizations are possible.
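As a rough illustration of feature-based selection, the sketch below picks one of two `hash_chunks` bodies at compile time with `#[cfg(feature = ...)]`. The signature, the `placeholder_mix` helper, and the `merkle_parent` caller are assumptions made for this example, not the PR's actual code, and both bodies use a non-cryptographic placeholder where a real crate would call its SHA-256 backend.

```rust
/// Hypothetical placeholder (NOT a real hash) so the sketch builds with std alone.
fn placeholder_mix(left: &[u8; 32], right: &[u8; 32]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = left[i].wrapping_mul(31) ^ right[i].wrapping_add(i as u8);
    }
    out
}

/// Compiled only when the `hashtree` feature is enabled.
#[cfg(feature = "hashtree")]
pub fn hash_chunks(left: &[u8; 32], right: &[u8; 32]) -> [u8; 32] {
    // Assumption: a real implementation would call the hashtree backend here.
    placeholder_mix(left, right)
}

/// Fallback compiled when `hashtree` is not enabled.
#[cfg(not(feature = "hashtree"))]
pub fn hash_chunks(left: &[u8; 32], right: &[u8; 32]) -> [u8; 32] {
    // Assumption: a real implementation would call the sha2 backend here.
    placeholder_mix(left, right)
}

/// Callers use the single name and never mention a backend.
pub fn merkle_parent(left: &[u8; 32], right: &[u8; 32]) -> [u8; 32] {
    hash_chunks(left, right)
}
```

Compared with a hasher trait behind dynamic dispatch, this keeps call sites unchanged and resolves the backend at compile time, at the cost of only one backend being available per build.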
Notes
hash_chunks function, whose implementation differs based on the enabled features. I've tried adding a generic hasher trait and using dynamic dispatch, but this became pretty messy. I think this is cleaner.
Benchmarks
sha2 (default)
sha2-asm
hashtree
In these benchmarks, hashtree is ~16% faster at generating proofs and ~20% faster at calculating hash tree roots compared to sha2-asm.