Parallelizable release prep

alice-i-cecile commented 4 months ago

Current release strategy

We need to prepare several parallel-but-distinct documents for each release:

The release notes
The migration guide
The complete change log
The list of contributors

The release notes have historically been created ad hoc, by looking through the list of merged PRs and adding a stub section. This has already been improved for the 0.14 cycle through the addition of the C-Needs-Release-Note label. This will give us a good initial list, but it will need to be checked for omissions, unwarranted inclusions, and opportunities to consolidate related work.

First, the migration guides are scraped from GitHub by looking for the C-Breaking-Change label. These are then compiled into a single list using the generate-release tool. Next, the individual migration guide entries need to be manually checked for errors, reversions, and general quality. Unlike the release notes, these may have last-minute additions in order to address critical bugs or problems that required breaking changes.

The change log is a simple list of all PRs, sorted by area, that were merged between the last Bevy release and the current one. This includes those shipped in patch releases as well! This step is largely mechanical, although it has historically been organized by area.
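To illustrate how mechanical this step is, here's a minimal sketch in Python. The `group_changelog` name and the `(number, title, area)` tuple shape are assumptions made for illustration; they are not taken from generate-release.

```python
from collections import defaultdict


def group_changelog(prs):
    """Group merged PRs by area and render a Markdown change log.

    `prs` is an iterable of (number, title, area) tuples. This shape is a
    hypothetical stand-in for whatever the scraper actually returns.
    """
    by_area = defaultdict(list)
    for number, title, area in prs:
        by_area[area].append((number, title))

    lines = []
    for area in sorted(by_area):
        lines.append(f"## {area}")
        # Sort by PR number so the output is stable between runs.
        for number, title in sorted(by_area[area]):
            lines.append(f"- {title} (#{number})")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Sorting both the areas and the PRs keeps regenerated output stable, so diffs only show real changes.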

Finally, the list of contributors is a simple scraped, randomized list of the complete set of PR authors.

Motivation

Preparing a new Bevy release is a ton of work, and often ends up bottlenecked on a single central PR with hundreds of comments that's prepared at the last minute.

This responsibility usually lands on a single person, who authors the PR and coordinates the release.

Keeping track of all of the work is painful, but the single most serious problem is how this interacts with parsed / generated migration guide data. Currently, generate-release simply scrapes the relevant section from each PR, compiles it into a list, and then the person running the tool submits a PR with the migration guide in a single file.

If we need to regenerate that list (due to changes), any manual cleanup is lost (unless an inordinate amount of care is taken).

The consequences of this are:

We can't start generating and editing the migration guide until the very end of the release cycle, once we have a feature freeze.
Extending this to the release notes is fundamentally infeasible, as it all needs to be carefully written and revised. Individual PRs simply don't have the context required to frame a feature, and the audience is different.

Core Idea

The core problem here is that both Git and any tooling we might build struggle to work with highly structured text documents in a single file.

Instead, we should have one folder per document we're producing (release notes / migration guide), and then store each section in individual files. Then, in the central file, import the various component files during the generation of the website according to human-friendly annotations.

This allows us to isolate changes from each other correctly and generate / regenerate sections of the migration guides and release notes independently.

generate-release, in the end, should be very friendly to use and hack. This means a nice CLI, lots of comments, modules, and an extensible design.

Details

Data for each release is organized into 2 folders and 2 stand-alone files. These folders must live outside of the standard content folder: otherwise they will generate stub pages. By convention, these are stored in a top-level release-content folder. Inside that folder, we will have a folder for each release, and inside of that there will be:

changelog.md: a simple generated file of all merged PRs, including their title and PR number, that can be organized by section either manually or automatically
authors.md: a trivial generated file containing the full list of authors in a random order
migration-guide/: a folder containing the full set of migration guide files, one for each PR that has C-Breaking-Change
- This should be initially generated from the Migration Guide section of each PR.
release-notes/: a folder containing the full set of release note files, one for each area that we want to cover.
- This could be initially generated from C-Needs-Release-Note PRs.
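To make the layout concrete, here's a minimal sketch in Python. The function names, the dictionary keys, and the idea of naming the per-release folder after the version (for example `0.14`) are assumptions for illustration, not existing generate-release behavior.

```python
from pathlib import Path


def release_paths(root, version):
    """Return the two files and two folders for one release."""
    # Assumption: the per-release folder is named after the version string.
    base = Path(root) / "release-content" / version
    return {
        "changelog": base / "changelog.md",
        "authors": base / "authors.md",
        "migration_guide": base / "migration-guide",
        "release_notes": base / "release-notes",
    }


def create_layout(root, version):
    """Create the folders and empty stand-alone files if they are missing."""
    paths = release_paths(root, version)
    for key in ("migration_guide", "release_notes"):
        paths[key].mkdir(parents=True, exist_ok=True)
    for key in ("changelog", "authors"):
        paths[key].touch(exist_ok=True)
    return paths
```

Calling `create_layout(".", "0.14")` twice is harmless, since existing folders and files are left alone.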

The data in this folder is then imported into two index pages (one for the migration guide, one for the release notes), stored in the correct directory for the final post just as in previous releases. Each index page describes how the various component files are integrated into a cohesive document. This is done via some text importing mechanism (see Open Questions).
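Since the importing mechanism is still undecided, the following is only a sketch of one option: have the tool concatenate the component files in index order, so the site generator sees a single merged document. The `combine` name and its inputs (a list of file names plus a mapping from file name to Markdown text) are hypothetical.

```python
def combine(index_order, sections):
    """Concatenate section bodies in the order given by the index.

    `index_order` is a list of file names; `sections` maps each file name
    to its Markdown text. Both shapes are assumptions for this sketch.
    """
    missing = [name for name in index_order if name not in sections]
    if missing:
        # Fail loudly instead of silently dropping a section.
        raise KeyError(f"index references missing files: {missing}")
    return "\n\n".join(sections[name].strip() for name in index_order) + "\n"
```

Because the order comes from the index alone, regenerating or editing one component file never disturbs the others.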

An initial list of files for both the migration guide and release notes is generated based on the labels assigned on the bevyengine/bevy repo. Each of these PRs gets its own file, with a name that combines the PR title and its PR number.
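A file name along those lines could be built as in the sketch below. The exact scheme (number first, lowercase, underscores, `.md` extension) is an assumption; only "PR title plus PR number" comes from the plan above.

```python
import re


def entry_filename(pr_number, title):
    """Build a file name from a PR number and title."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    if not slug:
        return f"{pr_number}.md"
    return f"{pr_number}_{slug}.md"
```

Putting the number first keeps names unique even when two PRs share a title.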

The generating tool is run periodically, updating the list of files. When a new file is added, a corresponding stub entry is added to the bottom of the Migration Guide / Release Notes index file, recording that it's currently uncategorized.
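Here's a minimal sketch of that update step. It assumes a hypothetical index format in which each entry is a line shaped like `- file_name.md`; the real index format is not specified here, and the function names are made up.

```python
def referenced_files(index_text):
    """Collect file names from index lines shaped like `- name.md ...`."""
    names = set()
    for line in index_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "-":
            names.add(parts[1])
    return names


def append_missing_stubs(index_text, file_names):
    """Append an uncategorized stub line for every file the index lacks."""
    known = referenced_files(index_text)
    missing = [name for name in sorted(file_names) if name not in known]
    if not missing:
        return index_text
    # Make sure the stubs start on their own line.
    body = index_text if index_text.endswith("\n") or not index_text else index_text + "\n"
    return body + "".join(f"- {name} (uncategorized)\n" for name in missing)
```

Running it again on its own output changes nothing, which is what a periodically run tool needs.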

Each file for the migration guide has the following metadata fields:

PR number(s): multiple may be grouped into a single entry
Title

The body text is used for the advice on how to actually migrate.

Each file for the release notes has the following metadata fields:

PR number(s): multiple may be grouped into a single entry
Title
Author(s): many features have multiple authors that need to be credited
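The two kinds of files could share one small data model, as sketched below. The `Entry` class, the field names, and the `+++`-delimited header are assumptions for illustration; the on-disk format is not fixed by this proposal. The sketch leans on the fact that JSON literals for simple strings and lists are also valid TOML.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One migration guide or release note file (hypothetical model)."""

    title: str
    prs: list[int]  # several PRs may be grouped into one entry
    authors: list[str] = field(default_factory=list)  # release notes only
    body: str = ""

    def render(self) -> str:
        """Render the metadata header followed by the body text."""
        lines = [
            "+++",
            f"title = {json.dumps(self.title, ensure_ascii=False)}",
            f"prs = {json.dumps(self.prs)}",
        ]
        if self.authors:
            lines.append(f"authors = {json.dumps(self.authors, ensure_ascii=False)}")
        lines.append("+++")
        return "\n".join(lines) + "\n\n" + self.body.strip() + "\n"
```

A migration guide entry simply leaves `authors` empty, so the same code serves both documents.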

There are several invariants we need to uphold (ideally checked in CI):

Each file must correspond to a section in the index.
Each section in the index must correspond to a file.
Each PR with a relevant label must correspond to a file.

Note that neither the migration guide nor release notes map cleanly onto PRs (even though the converse is true)! Many entries will cover multiple PRs, and a few will have no corresponding PR on bevyengine/bevy as they reflect process, book, or website changes that still need to be highlighted.

Because of this, the process to ensure that each PR corresponds to a section is slightly non-trivial. Rather than trying to maintain some form of index, this should be built dynamically each time the check is performed and compared against the list of PRs that should be included.
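A minimal sketch of such a check follows. The `check_invariants` name, its argument shapes, and the `completeness` flag are assumptions for illustration. The PR coverage set is rebuilt from the files' own metadata on every call rather than stored anywhere.

```python
def check_invariants(files, index_entries, labeled_prs, completeness=False):
    """Return a list of human-readable invariant violations.

    files: dict mapping file name -> set of PR numbers that file covers
    index_entries: set of file names referenced by the index
    labeled_prs: set of PR numbers carrying a relevant label
    """
    problems = []
    for name in sorted(set(files) - set(index_entries)):
        problems.append(f"file not in index: {name}")
    for name in sorted(set(index_entries) - set(files)):
        problems.append(f"index entry without file: {name}")
    if completeness:
        # Built fresh each run from file metadata; no separate index to maintain.
        covered = set().union(*files.values())
        for pr in sorted(set(labeled_prs) - covered):
            problems.append(f"labeled PR without file: #{pr}")
    return problems
```

Entries with no PR on bevyengine/bevy are fine here: a file with an empty PR set still satisfies the first two invariants.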

Implementation steps

[x] Fix bug with draft posts being published on RSS.
- https://github.com/bevyengine/bevy-website/pull/1162
[x] Create a new folder for the 0.14 release notes, with a stub index file.
[x] Create a new folder for the 0.14 migration guide, with a stub index file.
[x] Change the generate-release tool to generate separate files for each PR in the folder structure laid out above.
[x] Generate a migration-guide/_index.md file that can store links to the separate files.
[x] Change the generate-release tool to add entries to the index file when creating new files.
[ ] Add a daily job that scrapes the bevyengine/bevy repo and makes an individual PR for each missing entry, populated with the relevant content to start.
- When performing this process, be sure to check that a PR is not already open!
  - Maintainers must be able to merge PRs to the branch of the bot: as a result, it probably makes sense to create a branch directly on the bevyengine/bevy-website repo.
- Each of these new PRs is automatically assigned to a milestone and gets a label.
- The original authors of the work are automatically requested as reviewers.
[ ] Add a new validation tool to generate-release to check and repair the document invariants above.
[ ] Add a CI check to run this validation tool automatically (checking only for missing links, not completeness).
[ ] Add a note to the release checklist that the validation tool must be run in completeness mode.
[x] Mirror all of these tools so they work for release notes as well as migration guides.
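The planning half of the daily job could look like the sketch below. It only decides which PRs to open and makes no network calls. The `plan_bot_prs` name, the branch naming scheme, and the input shapes are all assumptions for illustration.

```python
def plan_bot_prs(labeled, covered, open_branches, milestone, label,
                 prefix="release-content/pr-"):
    """Decide which website PRs the bot should open.

    labeled: dict mapping source PR number -> list of original authors
    covered: set of source PR numbers that already have a file
    open_branches: set of bot branch names that already have an open PR
    """
    plans = []
    for number in sorted(labeled):
        branch = f"{prefix}{number}"
        # Skip work that is already done or already in flight.
        if number in covered or branch in open_branches:
            continue
        plans.append({
            "branch": branch,
            "source_pr": number,
            "reviewers": list(labeled[number]),
            "milestone": milestone,
            "label": label,
        })
    return plans
```

Deriving the branch name from the source PR number gives a cheap way to check that a PR is not already open.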

Open questions

How do we actually combine the text nicely using Zola?
- There is the Tera {% include %} tag, though it will cause orphan pages to be generated as well for each PR if they are also within the content folder. -BD103
- First, I'm not entirely sure how we can "[…] in the central file, import the various component files during the generation of the website according to human-friendly annotations." Unless the files we're importing are stored in another repo, outside of the /content/ directory, or in a separate file format like .toml, they will show up as orphan and/or invalid pages, which I believe is undesirable. Also, the one suggestion of the {% include … %} tag from Tera is primarily meant, as far as I know, for templates, to more or less create reusable widgets, rather than to import another .md file. - Trial

Relevant Links

mockersf commented 4 months ago

How do we actually combine the text nicely using Zola?

I think the generate-release could do the combination, so Zola will see just the merged file

BD103 commented 4 months ago

How do we actually combine the text nicely using Zola?

I think the generate-release could do the combination, so Zola will see just the merged file

I worry about unnecessarily large diffs, same thing with Cargo.lock. Furthermore, this may require CI checks to make sure everything is up-to-date.

This could be solved by not generating the final file until everything is done, but that's inconvenient for post-release edits.

alice-i-cecile commented 4 months ago

I've added the "Release 0.14" milestone, and we now have labels for A-Migration-Guide and A-Release-Notes to assign the generated PRs to.
