MinaProtocol / mina

Mina is a cryptocurrency protocol with a constant size blockchain, improving scaling while maintaining decentralization and security.
https://minaprotocol.com
Apache License 2.0
1.99k stars 529 forks source link

Parallel ledger hashing #16053

Open georgeee opened 1 month ago

georgeee commented 1 month ago

PR #15980 introduces a neat trick that makes ledger hashes for a mask (i.e. a block) to be computed in a single function call, hashing is executed layer by layer.

We could go one step further and come up with a batch version of hash function.

As of the moment, we rely on block_cipher exposed from the rust poseidon implementation. Sponge construction as a whole is implemented in snarky. What could be done is the following:

  1. Start using Rust's implementation for poseidon hash (including the sponge construction) // maybe padding/block splitting could be left in Ocaml for convenience
  2. Come up with a new function in Rust interface to compute a hash batch (instead of individual hash)
  3. Use rayon's par_iter in implementation of hash function to utilize multicore capabilities

Motivation

At this stage (after closing #14752, to be precise), stage ledger diff application's cost is dominated by computing various hashes:

Overwhelming majority of cost (>80%) for processing a max-account-update block comes from merge/account hashes. As measured on server, it's around 1.2s for a max-account-update block, and it uses a single thread. When executed on a 12-core server, it has potential to be reduced to ~300ms (napkin math™).

georgeee commented 1 month ago

Additional motivation to work on this item: it could significantly speedup catchup.

Assuming networking communication is perfect (and no bugs of supercatchup are activated), bottom line for catchup is building breadcrumbs, which is almost entirely down to hashing as part of staged ledger diff application. Same holds for loading ledger from persistence (although in this case we could store mask hashes in persistence to avoid the need for re-hashing).

By another estimation (napkin math™), we could improve catchup's bottom line by 50% on servers with >= 4 CPU cores.