Closed erwanor closed 6 months ago
xref #3505
I think that the approach of creating synthetic `Begin/EndBlock`s is not a good one compared to doing JMT surgery. While doing surgery on the JMT to edit the version behavior is somewhat tricky, it's self-contained: we only have to do the trickery once, in one place. On the other hand, the synthetic block approach leaks into higher abstraction levels and interferes with app logic.
Idea: what if we avoid surgery entirely by reading all of the keys out of the old database and writing them into a new one as part of a schema migration?
Then all the keys and values have a single version and we get pruning for free.
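A minimal sketch of that idea, using a toy `(key, version) -> value` map rather than the real JMT or its backing store. The `VersionedStore` alias and the `flatten` helper are hypothetical names introduced only for illustration; deletions and proofs are not modeled.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a versioned key-value store: (key, version) -> value.
/// This is NOT the jmt crate's data model, only an illustration.
type VersionedStore = BTreeMap<(String, u64), Vec<u8>>;

/// Reads the latest value of every key from `old` and writes it into a fresh
/// store in which every entry carries the same `new_version`.
fn flatten(old: &VersionedStore, new_version: u64) -> VersionedStore {
    let mut latest: BTreeMap<&str, &Vec<u8>> = BTreeMap::new();
    // Entries iterate in (key, version) order, so for a given key the last
    // insert seen is the one with the highest version.
    for ((key, _version), value) in old {
        latest.insert(key.as_str(), value);
    }
    latest
        .into_iter()
        .map(|(key, value)| ((key.to_string(), new_version), value.clone()))
        .collect()
}

fn main() {
    let mut old = VersionedStore::new();
    old.insert(("a".into(), 1), b"a1".to_vec());
    old.insert(("a".into(), 3), b"a3".to_vec());
    old.insert(("b".into(), 2), b"b2".to_vec());

    // Every surviving key ends up at the single version 0.
    let new = flatten(&old, 0);
    for ((key, version), value) in &new {
        println!("{key}@{version} = {}", String::from_utf8_lossy(value));
    }
}
```

In this toy model the stale entry `a@1` simply never gets copied, which is the sense in which old versions disappear as a side effect of the rewrite.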
That's what I had in mind here:
we will have to build on top of the jmt::restore API, consuming an existing tree state, and using a variation of the add_chunk method to append to the existing logical structure of the tree, rather than completely overwriting it. The jmt::restore API is also needed to implement JMT pruning (https://github.com/penumbra-zone/penumbra/issues/1806) and state pruning is probably a key part of every realistic chain upgrade scenario. The main difference is that the current API requires callers to specify a post-restoration root hash, which we would not be able to provide before the fact.
This doesn't completely get us pruning for free. It does clean up the logical structure of the tree but we will still have to do some extra work to prune the value store, so it's better to track pruning separately as a superset of this ticket.
The plan outlined in this ticket is accurate; what remains is to implement it. The specific implementation differs from the ones previously discussed here (restore the tree, or flatten it and rewrite every key). Instead, we will incrementally append to it. The work has started in a prototype (https://github.com/penumbra-zone/jmt/pull/110) and is being validated. Once we have high confidence that it works, we will be able to merge it and use `append_value_sets` to overwrite the JMT during migrations.
Is your feature request related to a problem? Please describe.
To perform a chain upgrade, a node operator needs to halt their node, then export and migrate the node's chain state. Frequently, but not always, the migration will alter the chain's consensus state. In that case, we need a process that is as unintrusive as possible. Specifically, we want:
Our current approach is to overwrite the JMT at the pre-upgrade version, as follows:
`pd` at height `h-1`
`h-1`
`h`
A basic implementation of this migration flow is captured by the `SimpleUpgrade` migration script. The flow relies on the ability to append data to the JMT "in-place", i.e. without increasing its state version. This process, implemented by `Storage::commit_in_place`, is currently broken, and using it will result in dangling JMT nodes that are inaccessible if queried via `get_with_proof`. So, while the data for that state version is available in our backing store, we cannot generate proofs for it. This is problematic.
For a correct implementation of `commit_in_place`, we will have to build on top of the `jmt::restore` API, consuming an existing tree state, and using a variation of the `add_chunk` method to append to the existing logical structure of the tree, rather than completely overwriting it. The `jmt::restore` API is also needed to implement JMT pruning (#1806), and state pruning is probably a key part of every realistic chain upgrade scenario. The main difference is that the current API requires callers to specify a post-restoration root hash, which we would not be able to provide before the fact.
Describe alternatives you've considered
Alternatively, we could rework the chain upgrade process to include an "offline" block `h`:
`pd` at height `h-1`
`h` and incrementing the JMT version number `h+1`
This would require the offline step to set up an `App` that would be fed synthetic `Begin/EndBlock`s. This approach has the benefit of not requiring any JMT change, but it creates more testing surface and a higher risk of wrongful validator slashing. For example, it would require tracking a "fake" uptime for each validator, using a made-up timestamp inserted into the `BeginBlock` contents.
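To make the version bookkeeping of the two flows concrete, here is a rough sketch with a toy store. `ToyStorage` and `chain_at` are hypothetical stand-ins, not Penumbra's `Storage`, and the sketch assumes the JMT version tracks block height one-to-one. It only models which version each write lands on; it says nothing about tree nodes or proofs, which is where the real `commit_in_place` breaks.

```rust
use std::collections::BTreeMap;

/// Toy versioned store (hypothetical; not Penumbra's `Storage`).
#[derive(Default)]
struct ToyStorage {
    /// Latest committed version; assumed here to equal the block height.
    version: u64,
    /// (version, key) -> value
    entries: BTreeMap<(u64, String), String>,
}

impl ToyStorage {
    fn write_at(&mut self, version: u64, writes: &[(&str, &str)]) {
        for (key, value) in writes {
            self.entries.insert((version, key.to_string()), value.to_string());
        }
    }

    /// Regular commit: bumps the version, then writes at the new version.
    fn commit(&mut self, writes: &[(&str, &str)]) -> u64 {
        self.version += 1;
        self.write_at(self.version, writes);
        self.version
    }

    /// In-place commit: writes at the current version without bumping it.
    fn commit_in_place(&mut self, writes: &[(&str, &str)]) -> u64 {
        self.write_at(self.version, writes);
        self.version
    }
}

/// Builds a store that has committed one block per height up to `version`.
fn chain_at(version: u64) -> ToyStorage {
    let mut storage = ToyStorage::default();
    for height in 1..=version {
        let h = height.to_string();
        storage.commit(&[("height", h.as_str())]);
    }
    storage
}

fn main() {
    // Current approach: the migration overwrites version h-1 (here 2),
    // so the first post-upgrade block is committed at h (here 3).
    let mut in_place = chain_at(2);
    let migrated = in_place.commit_in_place(&[("params", "new")]);
    let resumed = in_place.commit(&[("block", "first post-upgrade")]);
    println!("in-place: migration at {migrated}, resume at {resumed}");

    // Alternative: the migration is an "offline" block at h (here 3),
    // so the first post-upgrade block is committed at h+1 (here 4).
    let mut offline = chain_at(2);
    let migrated = offline.commit(&[("params", "new")]);
    let resumed = offline.commit(&[("block", "first post-upgrade")]);
    println!("offline block: migration at {migrated}, resume at {resumed}");
    println!("entries written: {}", offline.entries.len());
}
```

The extra version consumed by the offline block is what forces the application to process a block that was never produced by consensus, which is where the synthetic `Begin/EndBlock` concerns above come from.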