Closed — ralexstokes closed this 4 months ago
From conversations with @potuz:
The entry point for your sparse tree hash function is https://github.com/ralexstokes/ssz-rs/blob/main/ssz-rs/src/merkleization/mod.rs#L123, and the key point is that you use this signature for hashing: https://github.com/ralexstokes/ssz-rs/blob/main/ssz-rs/src/merkleization/mod.rs#L210
Your hasher function takes two chunks as input, combines them into a 64-byte block, and hashes that block. If you could adapt this routine to use a hasher that takes a slice of bytes (a multiple of 64 in length) as input, plus a preallocated output buffer (a multiple of 32 in length), so that the hasher continuously hashes all the 64-byte blocks and writes the consecutive 32-byte digests into the output slice, then you could switch to the vectorized assembly trivially and at no cost. The Go version of the algorithm is here: https://hackmd.io/80mJ75A5QeeRcrNmqcuU-g#Solving-1, and the faster version I'm using in ssz++ is here: https://github.com/OffchainLabs/sszpp/blob/main/lib/merkleize.hpp#L149
Latest: optimized hashing routines are available here: https://github.com/prysmaticlabs/hashtree
Closing in favor of #156
A big bottleneck for consumers of this crate is the performance of computing large Merkle trees. There are sophisticated caching techniques we can use to mitigate this; see #17. Separately, we can optimize the hashing itself: computing the parent nodes of a set of child nodes in one layer of the Merkle tree can be done in parallel, since each parent depends only on its own two children. This issue aims to address this latter performance technique.
relevant repo for investigating hashing techniques we may be able to use:
https://github.com/OffchainLabs/sszpp