near / nearcore

Reference client for NEAR Protocol
https://near.org
GNU General Public License v3.0
1.37 release timeline #10404

Closed posvyatokum closed 7 months ago

posvyatokum commented 9 months ago

This issue is for keeping history of the release.

Actual Timeline:
Wed 2024-01-11 - cut the 1.37 branch
Wed 2024-01-24 - release 1.37.0-rc.1 on testnet with voting planned for Mon 2024-01-29
Mon 2024-01-29 - release 1.37.0-rc.2 on testnet with voting moved to Mon 2024-02-05
Thu 2024-02-01 - release 1.37.0-rc.3 on testnet with voting moved to Tue 2024-02-06
Tue 2024-02-06 12:00:00 - Protocol version 64 voting on testnet
Tue 2024-02-06 14:20:00 - Start of the resharding epoch on testnet
Wed 2024-02-07 08:00:00 - Adoption of protocol version 64 on testnet and switch to the new shard layout
Tue 2024-03-05 - release 1.37.0 on mainnet with voting planned for Mon 2024-03-11 18:00:00
Fri 2024-03-08 - release 1.37.1 on mainnet without changes to voting. Added 2567e705c5d98ea7365fe3fbcaff5cacc5f81d4a for OOM, 915aea7d9e45a951d172d771d35cf1a40d3f2338 for stack overflow, and 77f40faf452db2815f5174585b6a0607c332dc60

Planned events:
Mon 2024-03-11 18:00:00 - 1.37.0 voting date on mainnet
Tue 2024-03-12 07:00:00 - start of resharding epoch on mainnet
Tue 2024-03-12 23:00:00 - 1.37.0 protocol upgrade on mainnet and the start of the first epoch with 5 shards

posvyatokum commented 9 months ago

1.37.0-rc.1 GO or NO-GO

1.37.0-rc.1 is planned to be released on 2024-01-23. See timeline above.

If you are tagged in a comment here, please respond. 👍 or 👎 reaction to the comment is enough. If you feel like you don't have any thoughts about the release, give a 👍 anyway (as in "I don't have any objections").

@gmilescu @akhi3030 @walnut-the-cat @wacban @telezhnaya

wacban commented 9 months ago

telezhnaya commented 9 months ago

What is the date when all the tooling should be updated for new testnet version? e.g. near-cli 23/29/31 of January?

posvyatokum commented 9 months ago

What is the date when all the tooling should be updated for new testnet version? e.g. near-cli 23/29/31 of January?

@telezhnaya If the tooling depends on the protocol version (which does not seem to be the case), then 31 Jan. If it depends on the release version of our testnet nodes, it should be updated 24-25 Jan, as that is approximately when Pagoda SREs will update all our nodes.

walnut-the-cat commented 9 months ago

other than what @wacban shared, no comments from my end. good to go

posvyatokum commented 8 months ago

1.37.0 GO/NO-GO decision

Status: We are finishing up the mocknet testing of resharding*. If everything goes right, we will release 1.37.0 on Tuesday 2024-02-27 around 18:00:00 UTC. If something goes wrong, we obviously will not.

@gmilescu @akhi3030 @walnut-the-cat @wacban @telezhnaya @khorolets @marcelo-gonzalez

If you are tagged in a comment here, please respond. 👍 or 👎 reaction to the comment is enough. If you feel like you don't have any thoughts about the release, give a 👍 anyway (as in "I don't have any objections").

* Our tests include:

I will post testing conclusions in this issue tomorrow. @marcelo-gonzalez please, do the same when you are finished with testing.

marcelo-gonzalez commented 8 months ago

@posvyatokum I posted a longer message on Zulip that I tagged people in, but it looks like there's a bug where validators can see different state roots under certain conditions. I think that needs to be investigated and fixed before we can safely proceed.

telezhnaya commented 8 months ago

I also need to add one PR before we release 1.37: I need to change the logic of broadcast_tx_commit back. https://github.com/near/nearcore/pull/9644#issuecomment-1965515825

posvyatokum commented 8 months ago

OK, then we are moving the 1.37.0 release by one week. Another issue that we will address during this week is legacy archival nodes. We don't think our mainnet legacy archival nodes will be able to get in sync with the chain in time for resharding. Their performance is barely faster than the chain itself, so whether they can get through resharding is also questionable. We will make an announcement about the deprecation of legacy archival nodes ASAP, so that validators have at least a week to migrate to split storage.

telezhnaya commented 7 months ago

This is the commit we need to cherry-pick: https://github.com/near/nearcore/pull/10655

posvyatokum commented 7 months ago

1.37.0 GO/NO-GO decision

Status: We are running one final test of the whole 1.37.0 release. If everything goes right, we will release 1.37.0 on Tuesday 2024-03-05 around 18:00:00 UTC. If something goes wrong, we obviously will not.

@gmilescu @akhi3030 @walnut-the-cat @wacban @telezhnaya @khorolets @marcelo-gonzalez

If you are tagged in a comment here, please respond. 👍 or 👎 reaction to the comment is enough. If you feel like you don't have any thoughts about the release, give a 👍 anyway (as in "I don't have any objections").

Addressing previous issues:

frol commented 7 months ago

@posvyatokum @nagisa @Ekleog-NEAR @khorolets Is the crates publishing also on track?

Ekleog-NEAR commented 7 months ago

I don’t see any run of the crates publishing workflow. @posvyatokum as it seems you’re the release manager, did you know of the recent-ish (a few months, one release ago) changes to the release process that added it?

posvyatokum commented 7 months ago

@Ekleog-NEAR semi-aware, definitely didn't see the Confluence update. I will add this step to the release template that we keep in github. Is this run correct/sufficient?

Ekleog-NEAR commented 7 months ago

I will add this step to the release template that we keep in github

Thank you! I wasn’t aware of the existence of a release template on github, I guess there was probably a race condition between the migration from confluence to github and my adding it to confluence around late December. Please let me know if I can help updating any other process document to make this a smoother experience!

Is this run correct/sufficient?

No. The process documented on confluence is:

After cutting the release and creating the new branch, the release owner needs to publish the workspace-wide crates, that are versioned alongside neard. The process is:

  1. Bump the workspace.metadata.workspaces.version field in the workspace Cargo.toml on the release branch to perform the publishing, and on the master branch to record the latest published version. Considering we are not careful about backward compatibility of internal crates, you should usually bump the major version of our current 0.major.minor versioning scheme, unless you manually checked the changes.

  2. Run the appropriate workflow, passing the release branch or tag as the parameter.

  3. In the release notes for the new neard version, record the corresponding version of the published nearcore crates, with a sentence like:

    Synchronously released crates corresponding to this nearcore version were published with version ${{workspace.metadata.workspaces.version}}

Looking at the release branch, my guess is you’re missing the first step here, of bumping to 0.21.0.
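As a rough illustration of step 1 and step 3 above, here is a minimal Python sketch of the "bump the major of 0.major.minor" rule and of the release-note sentence. The helper names `bump_major` and `release_note_line` are hypothetical, and `0.20.1` is only an assumed previous version picked for the example. None of this is part of the actual nearcore tooling.

```python
def bump_major(version: str) -> str:
    """Bump the 'major' part of a 0.major.minor version and reset minor to 0."""
    zero, major, _minor = (int(part) for part in version.split("."))
    if zero != 0:
        # The scheme described in the thread always has a leading 0.
        raise ValueError(f"expected a 0.major.minor version, got {version!r}")
    return f"0.{major + 1}.0"


def release_note_line(crates_version: str) -> str:
    """Build the sentence that step 3 asks to record in the release notes."""
    return (
        "Synchronously released crates corresponding to this nearcore "
        f"version were published with version {crates_version}"
    )


# "0.20.1" is an assumed starting point, used only for illustration.
new_version = bump_major("0.20.1")
print(new_version)
print(release_note_line(new_version))
```

With the assumed input, the bump yields 0.21.0, which matches the version mentioned in this comment. A minor-only bump would need a manual review of the changes, as step 1 notes, so it is deliberately left out of the sketch.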

posvyatokum commented 7 months ago

@Ekleog-NEAR I will reach out to you to check that we are doing everything right with 1.38. But I also need help figuring out what we do now. Options, in increasing order of my personal pain and mainnet risk from the resharding perspective:

  1. Nothing with 1.37, do everything right in 1.38
  2. 1.37.1 release with bump and crate release next week (after resharding is done, so after protocol upgrade)
  3. Code red 1.37.1 release before protocol upgrade.

I would also like to understand potential risks more. You can just point me to some old Zulip thread.

frol commented 7 months ago

Nothing with 1.37, do everything right in 1.38

New crates matching 1.37 API must be published. Otherwise, we cannot support the latest RPC API changes in our Rust tooling.

I don't think you have to make a new binary release after you bump the crate versions.

Ekleog-NEAR commented 7 months ago

@posvyatokum I’d say:

  1. Push a new commit to the 1.37 branch that bumps the crates version in Cargo.toml. There is no need to make any new nearcore release: it is a non-code change that will only affect crates.io
  2. Run the workflow with the 1.37 branch as a target

Probably ignore this paragraph: If the 1.37 branch is supposed to be frozen (I seemed to understand we had switched to a tag-based release management scheme, but just in case we’re still on the old branch-based scheme), then any branch would do the trick and you could just run the workflow off the commit hash, or create a new branch just for it (like the crates-0.20.x branch that we created between two releases, making a crates-0.21.x branch that’d be just 1.37 + the version-changing commit)

As for the risks, @frol already exposed them just above :)
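For readers unfamiliar with what the version-bumping commit in step 1 touches, here is a hedged Python sketch that rewrites the `workspace.metadata.workspaces.version` field in Cargo.toml text. The function name `set_workspace_crates_version` and the `SAMPLE` file contents are made up for illustration; the real nearcore Cargo.toml is much larger and may be laid out differently. In practice the change would likely be a one-line manual edit.

```python
import re


def set_workspace_crates_version(cargo_toml: str, new_version: str) -> str:
    """Return cargo_toml with the version under
    [workspace.metadata.workspaces] replaced by new_version."""
    out, in_table, replaced = [], False, False
    for line in cargo_toml.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Track whether we are inside the table we care about.
            in_table = stripped == "[workspace.metadata.workspaces]"
        elif in_table and re.match(r"version\s*=", stripped):
            line = f'version = "{new_version}"'
            replaced = True
        out.append(line)
    if not replaced:
        raise ValueError("no [workspace.metadata.workspaces] version field found")
    return "\n".join(out) + "\n"


# Hypothetical, heavily trimmed stand-in for the workspace Cargo.toml.
SAMPLE = '''\
[workspace.package]
version = "0.0.0"

[workspace.metadata.workspaces]
version = "0.20.1"
'''

print(set_workspace_crates_version(SAMPLE, "0.21.0"))
```

Only the `version` key inside the `[workspace.metadata.workspaces]` table is changed; other `version` keys, such as the one under `[workspace.package]` in the sample, are left alone.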

Ekleog-NEAR commented 7 months ago

@frol All the crates should have been published as 0.21 :)

frol commented 7 months ago

I believe it is time to close this issue. Feel free to re-open if you believe otherwise