hashgraph / hedera-services

Crypto, token, consensus, file, and smart contract services for the Hedera public ledger
Apache License 2.0

Improve state hashing performance #16086

Open OlegMazurov opened 1 month ago

OlegMazurov commented 1 month ago

Problem

State hashing becomes a bottleneck at large scale and high TPS. The bottleneck is due to high disk I/O in VirtualHasher.hash() as it reads precomputed hashes along the paths from dirty leaves to the root. Despite high parallelism, hashing has to compete for disk I/O bandwidth with transaction handling (including cache warming), compaction, flushing, transaction validation by client threads, etc. The CPU load from computing new hashes appears negligible compared to the disk cost.

Solution

Revisit the chunking idea for storing/retrieving hashes from disk with a new twist: we can shift the trade-off between CPU and I/O by storing only "leaf" hashes in every chunk and recomputing the "root" hash of a chunk (i.e. recomputing all chunk "internal" hashes), which becomes a dirty "leaf" in the parent chunk. This should noticeably decrease the number of disk reads: one read per path segment (determined by the height of a chunk), and fewer hashes per chunk read (2x fewer than the original chunking design, since chunk "internal" hashes are eliminated). The increased CPU load is acceptable, as we currently underutilize the CPU, and it makes hashing computation optimizations viable.
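To illustrate the CPU side of the trade-off, here is a minimal sketch of recomputing a chunk's "root" hash from only its stored "leaf" hashes. All names, the chunk representation, and the SHA-384 choice are illustrative assumptions, not the actual hedera-services implementation:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: a chunk stores only its bottom-level ("leaf") hashes on
// disk; all internal hashes, including the chunk root, are recomputed in memory.
public class ChunkHashSketch {

    // Folds 2^height leaf hashes pairwise until a single chunk-root hash remains.
    // This trades extra hashing CPU for the disk reads the internal hashes would
    // otherwise require.
    static byte[] chunkRoot(byte[][] leafHashes) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-384");
        byte[][] level = leafHashes;
        while (level.length > 1) {
            byte[][] parents = new byte[level.length / 2][];
            for (int i = 0; i < parents.length; i++) {
                digest.reset();
                digest.update(level[2 * i]);     // left child
                digest.update(level[2 * i + 1]); // right child
                parents[i] = digest.digest();
            }
            level = parents; // each pass halves the node count; no disk I/O needed
        }
        return level[0]; // becomes a dirty "leaf" in the parent chunk
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Toy chunk of height 2: four leaf hashes, fetched with a single disk read.
        byte[][] leaves = new byte[4][];
        for (int i = 0; i < 4; i++) {
            leaves[i] = MessageDigest.getInstance("SHA-384")
                    .digest(("leaf-" + i).getBytes(StandardCharsets.UTF_8));
        }
        byte[] root = chunkRoot(leaves);
        System.out.println("chunk root length = " + root.length);
    }
}
```

A chunk of height h then costs one disk read for its 2^h leaf hashes plus 2^h - 1 in-memory hash computations, instead of reads for every internal node along the path.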

Alternatives

No response

OlegMazurov commented 1 month ago

See Chunkify the Virtual Merkle #5286

jasperpotts commented 1 month ago

Sounds like an interesting idea :-)

jasperpotts commented 1 month ago

There is one downside when we get to state proof construction. We were planning on being able to collect all hashes along the path from a leaf to the root. With this suggestion, that will require computing hashes as well as disk reads. That may still be the right trade-off, but it's something to keep in mind.