Just one small thing to note is that any proposals in-flight will be cancelled at the time of the runtime upgrade.
@kdembler the CRT pallet will be unfrozen by default
looks pretty complete, good work 👍 should we add some upload tests on gleev and CRT features (like buy/sell tokens), after disabling maintenance mode, to make sure it's all working as expected?
For the runtime upgrade proposal it seems it goes to a "dormant" state after it gets the council approval
When we reference the "commit" for the runtime .. given that there could possibly be other non-runtime related changes, let us be clearer that it is the "runtime code shasum" produced by this script: https://github.com/Joystream/joystream/blob/nara/scripts/runtime-code-shasum.sh
For the runtime upgrade proposal it seems it goes to a "dormant" state after it gets the council approval
Yes, that means it is waiting for the 2nd round of council voting. Then it will go into the grace period, and after the grace period, when it actually executes, all other active proposals will be cancelled.
@freakstatic Thanks, removed point about unfreezing
@mnaamani Thanks, updated
Two jsgenesis operated nodes that I will probably also need to ensure are working are the "status" server and the "faucet" server.
Update on wallet compatibilities:
Updates after testnet upgrade:
- We've discovered that the initial approach of timing the last approval in the 2nd round will not work, because proposals in the 2nd round also have an expiry block. Placing the last vote late enough for the proposal to execute during the revealing stage is not possible. Instead, we will use the "trigger block" functionality that lets us set the exact execution block at proposal creation time. We have upgraded our testnet using this approach.
We have tested different versions of our software/nodes during an upgrade:
- Both Ephesus and Nara validator nodes were able to produce blocks before and after the upgrade.
- Both Ephesus and Nara QNs have continued to work after the upgrade. The Ephesus indexer experienced a crash as expected, but recovered and is processing blocks.
- ⚠️ The Ephesus QN processor crashed once we submitted the freeze pallet proposal, as it's not recognized. This means that we should update all critical QN instances to Nara versions before the upgrade executes.
- Both Ephesus and Nara versions of Orion survived the upgrade, but the Ephesus Orion processor crashed once we created a new channel.
- Seems both Ephesus and Nara faucets continue working fine after the upgrade. I think they both crashed during the upgrade but worked fine after restart.
- Storage Squid, Colossus and Argus continue to operate normally.
- Status server seems to have some small issues, @DzhideX will prepare an updated Nara version.
- Bedeho has mentioned that his CRT pallet review efforts are not going as planned and that he's giving his green light and we should not wait for him with the upgrade.
Will upgrade the checklist to reflect those changes
Great work compiling this 👍 As you mention, both faucets crashed with a closed connection, but restarting them fixed the problem and they kept working fine.
Some conclusions:
This document describes the plan and a checklist of what needs to happen to properly execute the Nara upgrade.
Before submitting proposal
- Confirm no issues in CRT pallet after code review (@bedeho)
First approval term
- […] B for upgrade proposal execution, with expected UTC date T corresponding to B. For execution, we should target the middle of the revealing period of term X+1. This way we ensure that term X+1 doesn’t change duration mid-term and only term X+2 will use the council term length from Nara. B = end_of(X) + 1 + 129,600 + 43,200 + (43,200 / 2). We may want to adjust to target a specific part of the revealing period so that the upgrade executes during working hours - we're targeting block 6,374,350 ~= Monday 26/02 at 13 CET ~= 2024-02-26 12:00:00 UTC
- […] B (@kdembler)
- […] X (DAO)
- […] T. We don’t expect any action needed on their side. Confirm with Nova Wallet team that their metadata portal will be updated automatically. (@bedeho, @kdembler)
Second approval term
- […] X+1 (DAO)
- T - 1d, apply Nara changes to GitHub repo (@mnaamani):
- T - 2h, put Atlas and Pioneer instances into maintenance mode (JSG, @kdembler)
- T - 30m, update public load balancers to point to Nara versions of QNs and RPCs. (@mnaamani, @kdembler)
- T - 25m, update status server with the Nara version pointed at updated public infra. (@DzhideX)
- T, start confetti showers 🎉 (everybody)
- T + 5m, confirm everything works as expected and infra is working fine. If yes, stop the Ephesus QNs and RPCs and upgrade to Nara versions. (@mnaamani, @kdembler)
- T + 5m, update https://metadata.joyutils.org (@kdembler)
- T + 30m, merge Atlas (Gleev) and Pioneer Nara PRs so they start deploying to production. Confirm on localhost that those versions pointed at public infra work fine. (JSG)
- T + 1h, re-enable access to Atlas and Pioneer instances. (JSG, @kdembler)
After upgrade
- […] Forkless Upgrades section in Handbook: https://handbook.joystream.org/system/blockchain#forkless-upgrades (DAO)
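
The trigger block arithmetic and the T-relative timeline from the checklist can be sketched in a few lines of Python. This is only an illustration, not tooling used for the upgrade: the function names are hypothetical, the 6-second average block time is an assumption that the checklist does not state, and the two terms before the revealing period (129,600 and 43,200 blocks) are copied from the formula without being labelled, since the checklist only names the revealing period.

```python
from datetime import datetime, timedelta, timezone

# Assumption: average block time in seconds (not stated in the checklist).
BLOCK_TIME_SECONDS = 6
# The two terms that precede the revealing period in the checklist formula.
PRE_REVEAL_BLOCKS = 129_600 + 43_200
# Length of the revealing period, from the "(43,200 / 2)" term.
REVEALING_BLOCKS = 43_200


def trigger_block(end_of_term_x: int, revealing_fraction: float = 0.5) -> int:
    """B = end_of(X) + 1 + 129,600 + 43,200 + fraction * 43,200.

    revealing_fraction=0.5 targets the middle of the revealing period;
    other values shift execution earlier or later within that period.
    """
    if not 0.0 <= revealing_fraction <= 1.0:
        raise ValueError("revealing_fraction must be between 0 and 1")
    offset = int(REVEALING_BLOCKS * revealing_fraction)
    return end_of_term_x + 1 + PRE_REVEAL_BLOCKS + offset


def estimate_time(target_block: int, reference_block: int,
                  reference_time: datetime) -> datetime:
    """Estimate when target_block is produced, given a known block and its time."""
    delta_blocks = target_block - reference_block
    return reference_time + timedelta(seconds=delta_blocks * BLOCK_TIME_SECONDS)


def schedule(t: datetime) -> list[tuple[str, datetime]]:
    """Absolute times for the T-relative steps listed in the checklist."""
    offsets = [
        ("T - 1d", timedelta(days=-1)),
        ("T - 2h", timedelta(hours=-2)),
        ("T - 30m", timedelta(minutes=-30)),
        ("T - 25m", timedelta(minutes=-25)),
        ("T", timedelta(0)),
        ("T + 5m", timedelta(minutes=5)),
        ("T + 30m", timedelta(minutes=30)),
        ("T + 1h", timedelta(hours=1)),
    ]
    return [(label, t + delta) for label, delta in offsets]


if __name__ == "__main__":
    # Hypothetical reference point: a block observed one hour before the target.
    t = estimate_time(6_374_350, 6_374_350 - 600,
                      datetime(2024, 2, 26, 11, 0, tzinfo=timezone.utc))
    for label, when in schedule(t):
        print(f"{label:>8}  {when.isoformat()}")
```

Because real block times drift, an estimate like this would need to be recomputed from a fresh reference block close to the upgrade, and the trigger block itself stays fixed once the proposal is created.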