Open JayT106 opened 2 years ago
Shouldn't H be committed with the old binary?
Yes, it was committed, but it looks like the commit failed because of the error I posted (this is our operational issue). So the app/consensus state was not updated.
Summary of Bug
We observed an issue when cosmovisor was running the previous binary (SDK v0.44.3) and an error occurred at the upgrade plan's execution height H. In our case it was a file permission issue (an operational issue, not an SDK one), so the block at H could not be committed completely and the app/consensus state stayed at H - 1. We then saw an error.
We checked the wal log, and it already stores the end of block height at H.
Later, after the node restarts, cosmovisor tries to use the new binary (SDK v0.45.4) to replay block H, and the store complains that it cannot load version H.
Version
v0.44.3 and v0.45.4
Solution
The upgrade module might need to check that the app/state height matches the block height, replay block H with the original release binary, and then proceed with the upgrade plan.
Another workaround would be to let the node roll back the pending block H and restart with the original binary, but that does not look feasible with the current Tendermint rollback implementation.
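The height check proposed above could look roughly like the following Go sketch. It is an assumption about how such a check might be structured, not the upgrade module's or cosmovisor's actual logic; `NextAction`, the `Action` values, and the three height parameters are hypothetical names introduced here.

```go
package main

import "fmt"

// Action is a hypothetical label for what a supervisor would do next.
type Action string

const (
	ContinueWithOld Action = "continue-with-old-binary"
	ReplayWithOld   Action = "replay-with-old-binary"
	RunUpgrade      Action = "run-new-binary"
)

// NextAction sketches the proposed check (illustrative only):
//   - appHeight:   last height the application state fully committed
//   - blockHeight: last height the consensus side recorded as ended
//   - planHeight:  the upgrade plan height H
func NextAction(appHeight, blockHeight, planHeight int64) Action {
	switch {
	case blockHeight < planHeight:
		// The plan height has not been reached yet.
		return ContinueWithOld
	case appHeight < blockHeight:
		// The app state is behind the recorded block height:
		// replay with the original binary before upgrading.
		return ReplayWithOld
	default:
		// The heights agree, so the upgrade plan can proceed.
		return RunUpgrade
	}
}

func main() {
	const h = 100 // assumed upgrade height H

	fmt.Println(NextAction(h-1, h, h)) // prints "replay-with-old-binary"
	fmt.Println(NextAction(h, h, h))   // prints "run-new-binary"
}
```

The first call corresponds to the case reported in this issue (state at H - 1, block height at H); the second is the normal path where both heights match.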