Open twoeths opened 11 months ago
Regarding merging ssz state bytes into an existing TreeViewDU state, I have some statistics from this branch https://github.com/tuyennhv/lodestar/blob/tuyen/state_perf_test/packages/state-transition/test/unit/util/migrateState.test.ts#L71C9-L71C9
- load state 7335296: `const seedState = stateType.deserializeToViewDU(data_7335296);` => it takes 1.3s
- `seedState.hashTreeRoot();` => this takes 36s
- `const migratedState = migrateState(seedState, newStateBytes);` => this takes 0.5s-0.7s for a 64-slot difference. At this step `migratedState` and `seedState` share a lot of data, mainly the `state.validators`
- `migratedState.hashTreeRoot()` => this takes 1.5s
- creating `CachedBeaconState` is ~2.3s given a 64-slot difference (base state vs new state)

Note this assumes we have a Shuffling cache to save time when creating `CachedBeaconState`, otherwise it takes 0.8s - 1s more
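The gap between the 36s full hash and the 1.5s hash after migration comes from reusing already-hashed data. Below is a minimal sketch of that idea, not the branch's actual `migrateState`. Every name in it (`CachedLeaf`, `deserialize`, `migrate`, the 8-byte record size) is hypothetical, and a cheap FNV-1a hash stands in for the SHA-256 merkleization a real SSZ tree uses.

```typescript
// One serialized record plus a lazily computed, cached root.
class CachedLeaf {
  readonly bytes: Uint8Array;
  private root: number | null = null;

  constructor(bytes: Uint8Array) {
    this.bytes = bytes;
  }

  // Stand-in for an expensive hash (FNV-1a, NOT the SHA-256 used by SSZ).
  hashTreeRoot(counter: {hashes: number}): number {
    if (this.root === null) {
      counter.hashes++;
      let h = 0x811c9dc5;
      for (let i = 0; i < this.bytes.length; i++) {
        h ^= this.bytes[i];
        h = Math.imul(h, 0x01000193) >>> 0;
      }
      this.root = h;
    }
    return this.root;
  }
}

function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) if (a[i] !== b[i]) return false;
  return true;
}

// Split a byte array into fixed-size records, none of them hashed yet.
function deserialize(data: Uint8Array, recordSize: number): CachedLeaf[] {
  const leaves: CachedLeaf[] = [];
  for (let off = 0; off + recordSize <= data.length; off += recordSize) {
    leaves.push(new CachedLeaf(data.slice(off, off + recordSize)));
  }
  return leaves;
}

// Build the new list, reusing any seed leaf whose bytes are unchanged
// so that its cached root carries over.
function migrate(seed: CachedLeaf[], newData: Uint8Array, recordSize: number): CachedLeaf[] {
  const out: CachedLeaf[] = [];
  for (let i = 0, off = 0; off + recordSize <= newData.length; i++, off += recordSize) {
    const chunk = newData.subarray(off, off + recordSize);
    const old = seed[i];
    const unchanged = old !== undefined && bytesEqual(old.bytes, chunk);
    out.push(unchanged ? old : new CachedLeaf(chunk.slice()));
  }
  return out;
}

// Usage: four 8-byte records, only record 1 differs in the new bytes.
const counter = {hashes: 0};
const seedBytes = new Uint8Array(4 * 8).fill(1);
const seedLeaves = deserialize(seedBytes, 8);
seedLeaves.forEach((leaf) => leaf.hashTreeRoot(counter)); // hashes all 4 records

const newBytes = seedBytes.slice();
newBytes[8] = 9; // first byte of record 1
const migratedLeaves = migrate(seedLeaves, newBytes, 8);
migratedLeaves.forEach((leaf) => leaf.hashTreeRoot(counter)); // hashes only the changed record
```

In this toy version the second pass hashes one record instead of four. The real saving depends on how many validators actually changed between the two states.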
Really cool to see you exploring this solution direction!
> load state 7335296: `const seedState = stateType.deserializeToViewDU(data_7335296);` => it takes 1.3s
Do you propose to store the hashing cache in the DB, or load it from a similar state available in memory?
> Do you propose to store the hashing cache in the DB, or load it from a similar state available in memory?

@dapplion I'd load it from a similar state available in memory. If I migrate from a mainnet seedState from 1 day ago to the current mainnet state, it takes ~2.9s to create `CachedBeaconState`, which is not too different from the 64-slot-difference load, which takes ~2.3s. I think that's because a lot of validators do not change over time. I noted the benchmark result here https://github.com/tuyennhv/lodestar/blob/4a69ce59929ea3065fdf65e75c1d4a88f1922c45/packages/state-transition/test/unit/util/migrateState.test.ts#L66
Problem description
Today lodestar stores up to 96 states in the state cache and up to 10 epochs of checkpoint states in the checkpoint state cache, along with the justified and finalized states. This consumes a lot of memory, and the node has to be restarted multiple times due to OOM during long periods of non-finality, as in the early days of Holesky.
Solution description
- `Map<Epoch, Map<RootHex, Shuffling>>`
- `getLatest()` or `get()`, load from db/file if needed

Additional context
Storing temporary states in LevelDB and removing them later may cause the db to grow. We may consider persisting them to the file system instead and removing them when the chain is finalized.
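To make the cache shape from the solution description concrete, here is a minimal sketch of a two-level `Map<Epoch, Map<RootHex, V>>` with `get()` and `getLatest()` lookups and a fallback to persisted storage. It is an illustration under assumptions, not Lodestar's implementation: the class name `EpochRootCache`, the `maxEpochs` limit, the `Loader` hook, and the exact semantics of `getLatest()` are all hypothetical, and the sketch only reads persisted entries without writing them.

```typescript
type Epoch = number;
type RootHex = string;

// Hypothetical persistence hook: returns a value previously written to db/file, or null.
type Loader<V> = (epoch: Epoch, root: RootHex) => V | null;

class EpochRootCache<V> {
  private readonly items = new Map<Epoch, Map<RootHex, V>>();
  private readonly maxEpochs: number;
  private readonly loadPersisted: Loader<V>;

  constructor(maxEpochs: number, loadPersisted: Loader<V>) {
    this.maxEpochs = maxEpochs;
    this.loadPersisted = loadPersisted;
  }

  add(epoch: Epoch, root: RootHex, value: V): void {
    let byRoot = this.items.get(epoch);
    if (byRoot === undefined) {
      byRoot = new Map<RootHex, V>();
      this.items.set(epoch, byRoot);
    }
    byRoot.set(root, value);
    this.prune();
  }

  // Exact lookup; on a memory miss, ask the persisted store (result is not re-cached here).
  get(epoch: Epoch, root: RootHex): V | null {
    const hit = this.items.get(epoch)?.get(root);
    if (hit !== undefined) return hit;
    return this.loadPersisted(epoch, root);
  }

  // Value for `root` at the highest in-memory epoch <= maxEpoch, or null if none.
  getLatest(root: RootHex, maxEpoch: Epoch): V | null {
    const epochs = Array.from(this.items.keys())
      .filter((e) => e <= maxEpoch)
      .sort((a, b) => b - a);
    for (const epoch of epochs) {
      const hit = this.items.get(epoch)?.get(root);
      if (hit !== undefined) return hit;
    }
    return null;
  }

  // Drop the oldest epochs from memory once over the limit.
  private prune(): void {
    const epochs = Array.from(this.items.keys()).sort((a, b) => a - b);
    while (epochs.length > this.maxEpochs) {
      this.items.delete(epochs.shift() as Epoch);
    }
  }
}

// Usage: keep 2 epochs in memory; epoch 1 exists only in the (fake) persisted store.
const persisted = new Map<string, string>([["1:0xaa", "value-1-aa"]]);
const cache = new EpochRootCache<string>(2, (epoch, root) => persisted.get(`${epoch}:${root}`) ?? null);
cache.add(2, "0xaa", "value-2-aa");
cache.add(3, "0xaa", "value-3-aa");
cache.add(4, "0xbb", "value-4-bb"); // epoch 2 is pruned from memory here

const fromDisk = cache.get(1, "0xaa");
const prunedMiss = cache.get(2, "0xaa");
const latestAa = cache.getLatest("0xaa", 4);
const latestBb = cache.getLatest("0xbb", 3);
```

A production version would need to write an entry to the db or file system before evicting it, and delete persisted entries once the chain finalizes, as suggested above.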
Progress