hashgraph / hedera-services

Crypto, token, consensus, file, and smart contract services for the Hedera public ledger
Apache License 2.0

Flush virtual node cache to disk before a snapshot is taken #13764

Open poulok opened 5 months ago

poulok commented 5 months ago

There is some data in the state that is not currently represented in protobuf, such as the virtual node cache.

This should be implemented after the migration of state to a single virtual map to eliminate wasted effort (defining protobuf for parts of the tree that will no longer exist once we migrate to a single virtual map).

The virtual node cache may never need to be serialized, provided it can be guaranteed that the cache is empty on every snapshot. Most of the work of this ticket is determining whether this approach is feasible. If state snapshots do not need to be very fast, this is a good option. If snapshots must remain very fast, the virtual node cache will have to be serialized to disk in protobuf format, and this design proposal will be more complex. Part of the consideration should be how long it takes to serialize the cache compared with flushing it.
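To make the flush-first option concrete, here is a minimal toy sketch in Java. It is not the hedera-services virtual map API: `ToyNodeCache`, its methods, and the in-memory "disk" map are all hypothetical names made up for illustration. It only shows the invariant being discussed: if every snapshot flushes first, the cache is empty at snapshot time and the snapshot consists of on-disk data only, so the cache needs no serialized form.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model for illustration only; not the hedera-services API.
final class ToyNodeCache {
    // In-memory nodes that have not yet been written to disk.
    private final Map<Long, String> dirty = new HashMap<>();
    // Stand-in for the on-disk data source.
    private final Map<Long, String> disk = new HashMap<>();

    void put(long path, String value) {
        dirty.put(path, value);
    }

    boolean isCacheEmpty() {
        return dirty.isEmpty();
    }

    /** Moves all cached entries to the stand-in disk store; returns elapsed nanoseconds. */
    long flush() {
        long start = System.nanoTime();
        disk.putAll(dirty);
        dirty.clear();
        return System.nanoTime() - start;
    }

    /** Flushes first, so the snapshot is built from disk contents only. */
    Map<Long, String> snapshot() {
        flush();
        if (!dirty.isEmpty()) {
            throw new IllegalStateException("cache must be empty at snapshot time");
        }
        return new HashMap<>(disk);
    }

    /** True if a measured flush time fits within an assumed snapshot time budget. */
    static boolean flushFitsBudget(long flushMillis, long budgetMillis) {
        return flushMillis <= budgetMillis;
    }

    public static void main(String[] args) {
        ToyNodeCache cache = new ToyNodeCache();
        cache.put(1L, "leaf-a");
        cache.put(2L, "leaf-b");
        Map<Long, String> snap = cache.snapshot();
        System.out.println("snapshot entries: " + snap.size());
        System.out.println("cache empty after snapshot: " + cache.isCacheEmpty());
    }
}
```

The trade-off is that `snapshot()` pays the full flush cost every time. Whether that is acceptable depends on the measured flush time against whatever snapshot time budget is chosen, which is what `flushFitsBudget` stands in for.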

This ticket is complete when we have decided whether to flush the virtual node cache on every snapshot or to serialize it.

artemananiev commented 1 month ago

I checked the current metrics in mainnet and testnet, in particular "copy flush time, ms". No virtual map takes more than 500 ms to flush on any node.