MinaProtocol / mina

Mina is a cryptocurrency protocol with a constant size blockchain, improving scaling while maintaining decentralization and security.
https://minaprotocol.com
Apache License 2.0

Memoize transaction-related hashes #14752

Open · georgeee opened 11 months ago

georgeee commented 11 months ago

While examining the performance of Staged_ledger.update_coinbase_stack_and_get_data, I noticed that even after the optimizations culminating in PR #14643, up to 50% of the time is spent in the first and second passes of transaction application.

This was measured on blocks of 128 transactions, each a 9-account-update zkApp (deploying 8 new accounts). No events or memos were used, which might have affected the results.

When I broke the work down into cost centers, the following pieces accounted for a significant share of the cost:

  1. Derive token id (derive_token_id in Account_id)
  2. Events-related hashing (Zkapp_account.Event.hash and Zkapp_account.Make_events.push_hash)
  3. Hashing of zkapp URI (hash_zkapp_uri in Zkapp_account)

These hashing routines (unlike the account hashes and merge hashes used in Merkle tree building) depend solely on the transaction in question, so they can be performed before block creation and the hashes memoized.

This would reduce the pressure on the block window duration, hopefully cutting the time spent in update_coinbase_stack_and_get_data in half.
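
A minimal, self-contained sketch of the memoization idea (not Mina's actual API; the key type, `expensive_hash`, and module names below are illustrative): a transaction-dependent hash is computed once per distinct input and cached, so block application only has to look it up.

```ocaml
(* Illustrative memoization wrapper, not Mina's actual API.
   [expensive_hash] stands in for a transaction-dependent hashing routine
   such as token id derivation or zkApp URI hashing. *)

module Memo (Key : Hashtbl.HashedType) = struct
  module Tbl = Hashtbl.Make (Key)

  (* Wrap [f] so each distinct key is computed at most once. *)
  let memoize (f : Key.t -> 'a) : Key.t -> 'a =
    let cache = Tbl.create 257 in
    fun key ->
      match Tbl.find_opt cache key with
      | Some v -> v
      | None ->
          let v = f key in
          Tbl.add cache key v ;
          v
end

module String_key = struct
  type t = string
  let equal = String.equal
  let hash = Hashtbl.hash
end

module String_memo = Memo (String_key)

(* Hypothetical expensive, purely input-dependent hash;
   a placeholder for Poseidon hashing in the real code. *)
let expensive_hash (account_id : string) : int = Hashtbl.hash account_id

(* The memoized version can be computed when the transaction is first seen
   (e.g. on mempool admission) and reused during block application. *)
let derive_token_id_memoized = String_memo.memoize expensive_hash
```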

georgeee commented 2 months ago

Now I think it could shave off ~30%.

With an increased number of events it could be more than that (needs measuring).

georgeee commented 2 months ago

Current state

A recent `compatible` build with #15980 and https://github.com/o1-labs/proof-systems/pull/2394 merged was tested on o1.yak.finance.

Connected to mainnet, it took 4 minutes to load the frontier (down from 5m11s on the 3.0.1 release).

When tested with the test-apply tool, it took 1.7s per max-size block, which adds up to about 8 minutes. This is a lower bound for the catchup-from-scratch procedure, and also a limiting factor (though less important than networking and proving) in reducing the block window.
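
For context: 8 minutes at 1.7s per block works out to roughly 480s / 1.7s ≈ 280 blocks, presumably the full transition frontier (k = 290) that a from-scratch catchup has to replay.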

A few more optimizations to reduce hashing could be performed (as described in this issue).

Alternatively, some smarter parallelization strategies could be employed: run 2-3 hashing procedures in parallel by synchronizing 2-3 ledgers instead of the single one we currently synchronize for bootstrap.

With 6-core parallelism, the catchup lower bound could be cut down to ~90 seconds, which is acceptable. We should also start keeping masks in the DB to speed up restarts (though for now this mainly serves as an easy way to benchmark performance).
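
A rough sketch of the parallelization idea above, assuming an OCaml 5 runtime with domains (Mina itself runs on Async, so this only illustrates the strategy, not a drop-in implementation; `precompute_hashes` and the stand-in hash are hypothetical). Since the per-transaction hashes depend only on the transaction, they can be precomputed across N workers before the sequential block application that consumes them:

```ocaml
(* Illustrative only: parallel precomputation of transaction-dependent
   hashes using OCaml 5 domains. Each worker fills a disjoint slice of
   the result array, so no synchronization is needed. *)

let precompute_hashes ~(num_workers : int) (txs : string array) : int array =
  let n = Array.length txs in
  let results = Array.make n 0 in
  let chunk = (n + num_workers - 1) / num_workers in
  let worker w =
    let lo = w * chunk in
    let hi = min n (lo + chunk) in
    for i = lo to hi - 1 do
      (* stand-in for the expensive transaction-dependent hashing *)
      results.(i) <- Hashtbl.hash txs.(i)
    done
  in
  let domains =
    List.init num_workers (fun w -> Domain.spawn (fun () -> worker w))
  in
  List.iter Domain.join domains ;
  results
```

Dividing the ~8-minute sequential cost across 6 such workers is the back-of-the-envelope basis for the ~90-second figure above.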