[PROPOSAL] Changelog and release notes process

ashwin-pc commented 1 year ago

Based on the goals of the changelog RFP, here is an improved process for maintaining the changelog.

1. Overview

There are 3 pain points with the existing changelog process that this proposal addresses:

- Creating a changelog entry is tedious (it is manual and needs a second commit).
- Keeping the changelog accurate, both in main and in the version branches, is painful and error-prone. This contributes to the churn while creating the release notes.
- Merge conflicts are common, since entries often target the same section of the file.

My proposal for addressing these issues involves the following high-level process:

- Split the changelog into change sets and a changelog. A change set contains all the changes associated with a single PR, while the changelog contains the change sets that are rolled up into it during releases, making it essentially a collection of release notes.
- Ease the process of creating the change set by automating its construction from the PR body (optional but preferred, details mentioned below).
- Document how the changelog process works, what our changelog should and should not contain, and where each entry needs to reside across the various branches, to remove ambiguity across versions.

2. Plan

2.1 Split the Changelog

The first step will be to introduce the changeset folder and refactor the changelog file.

2.1.1 Change set

Create a /changelogs/fragments folder to store each change set. Each file in this folder will carry a change set in the .yml format. Ref: Ansible

Sample changeset file: [short_name].yml (name of the file is irrelevant as long as it does not conflict with the name of another file here)

bugfixes:
  - This is a sample fix

feature:
  - Introduces a new feature

These changesets are essentially the unreleased changes for the project. Once these changes have made it into a release, they are removed from the fragments folder.

2.1.2 Changelog

The changelog will be updated to remove the Unreleased and 2.x and 1.x sections. All released version changes will remain. The changelog will now act as a tracker for all the released changes with a link to the changeset fragments for unreleased changes. If necessary to view the changelog with the unreleased changes too, this can be generated by the user using the release note script.

2.2 Easy changelog entry

To make creating a changeset easier during the PR process, we will simplify the two-commit process down to a single step.

2.2.1 The manual way

Along with the change, add a new file to the /changelogs/fragments folder following the changeset template from before. It does not need the PR number to be associated with it since we will only allow one changeset per PR, so the contents of the file will automatically be associated with only a single PR, which can be used to generate the changelog as required. This should reduce the burden of creating the changelog entry from a two-step process to a single step. This also has the added benefit that the changelog entry will not be a copy of the PR title, which forces the PR author to be deliberate about the changes they are making.

The new process of adding a changelog entry needs to be documented with a link to it added to the PR template.

Note: The one downside to this approach is out-of-turn changes. If a changeset is modified after it is created, the commit history for the file will no longer point to the PR that created it. There are two ways around this issue. One is to use the automated way, since the automation has access to the PR and can link it without any additional effort. The other is to require out-of-turn changes to add extra metadata to the changeset that indicates the original PR that triggered the change. Of the two, I recommend the automated approach, simply because it is trivial to implement and has the added benefit of adding the PR information directly to the changeset file. It also makes the release note generation script a lot easier to implement.

2.2.2 Automation (Preferred)

The step can be further simplified using GitHub Actions. We can introduce a custom action that adds a changeset based on the contents of the PR description, and the PR template itself can be updated with a section that highlights the changes the author is introducing. The action can be triggered either automatically when the PR is raised, or via a label like Add Changeset. The updated template will look something like this:

// ... other PR sections

## Changelog
<!--
Add each of the changelog entries as a line item in this section. e.g.
- fix: Updates the graph 
- feat: Adds a new feature

If this change does not need to be added to the changelog, just add a single `skip` line e.g.
- skip

Valid prefixes: breaking, deprecate, feat, fix, infra, doc, chore, refactor, test
-->
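
The parsing step that such an action performs could be sketched as follows. This is a minimal illustration in Python, not the proof-of-concept action's actual code: the function name `parse_changelog_section` and the error handling are assumptions, and only the `## Changelog` heading, the `skip` line, and the prefix list come from the template above.

```python
import re

# Prefixes listed in the proposed PR template.
VALID_PREFIXES = {
    "breaking", "deprecate", "feat", "fix", "infra",
    "doc", "chore", "refactor", "test",
}


def parse_changelog_section(pr_body):
    """Return {prefix: [descriptions]} parsed from the '## Changelog' section.

    Hypothetical helper. Returns an empty dict for a single 'skip' entry and
    raises ValueError when the section is missing, empty, or malformed.
    """
    # Drop HTML comments so the template's instructions are not parsed as entries.
    body = re.sub(r"<!--.*?-->", "", pr_body, flags=re.DOTALL)

    # Capture the text between '## Changelog' and the next '## ' heading (or the end).
    match = re.search(
        r"^## Changelog[ \t]*$(.*?)(?=^## |\Z)",
        body,
        flags=re.DOTALL | re.MULTILINE,
    )
    if match is None:
        raise ValueError("PR body has no '## Changelog' section")

    # Collect the text of every '- ' line item in the section.
    items = []
    for line in match.group(1).splitlines():
        line = line.strip()
        if line.startswith("- "):
            items.append(line[2:].strip())

    if not items:
        raise ValueError("Changelog section has no entries")
    if items == ["skip"]:
        return {}

    entries = {}
    for item in items:
        prefix, sep, description = item.partition(":")
        prefix = prefix.strip().lower()
        description = description.strip()
        if not sep or prefix not in VALID_PREFIXES or not description:
            raise ValueError(f"Invalid changelog entry: {item!r}")
        entries.setdefault(prefix, []).append(description)
    return entries
```

The returned mapping could then be written out as the PR's changeset file. Rejecting a `skip` line mixed with other entries is a design choice of this sketch, not something the proposal specifies.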

A proof-of-concept action that implements this:

Action: https://github.com/ashwin-pc/test-github-action
PR that utilizes the action: https://github.com/ashwin-pc/Opensearch-CSV/pull/2

2.3 Documenting the changelog process

With all these changes, it becomes important to clarify what the changelog will actually contain and how we keep it accurate across the different versions. Ideally, we should also avoid a significant rewrite of the existing changelog. So what does the new changelog look like?

- All unreleased changes go into /changelogs/fragments as their own changeset, one changeset per PR.
- The changes, along with their changesets, are backported to the respective version branches.
- When we create a release (e.g. 2.7) in the 2.x branch, we generate release notes from the changesets, delete the changeset files, and add the release notes to the top of the latest CHANGELOG.md. We then forward port these changes to main which will simultaneously delete the changeset files and update the main changelog.
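
The release step above can be sketched as follows. This is a hedged Python illustration, not the project's release notes script: the function names, the Markdown layout of the generated notes, and the assumption that fragments have already been parsed from YAML into dictionaries are all hypothetical.

```python
def build_release_notes(version, fragments):
    """Merge parsed changeset fragments into one Markdown release section.

    fragments maps a fragment file name to {section: [entries]}, i.e. the
    already-parsed contents of each file in /changelogs/fragments.
    """
    merged = {}
    for name in sorted(fragments):  # sorted for deterministic output
        for section, entries in fragments[name].items():
            merged.setdefault(section, []).extend(entries)

    lines = [f"## {version}", ""]
    for section in sorted(merged):
        lines.append(f"### {section}")
        lines.extend(f"- {entry}" for entry in merged[section])
        lines.append("")
    return "\n".join(lines)


def prepend_release(changelog, notes):
    """Insert notes above the first existing '## ' release heading."""
    marker = changelog.find("\n## ")
    if marker == -1:
        # No released versions yet: append after whatever header exists.
        return changelog.rstrip("\n") + "\n\n" + notes
    head, tail = changelog[: marker + 1], changelog[marker + 1 :]
    return head + notes + "\n" + tail


def cut_release(version, changelog, fragments):
    """Return the updated changelog and the fragment files to delete."""
    notes = build_release_notes(version, fragments)
    return prepend_release(changelog, notes), sorted(fragments)
```

Returning the list of fragment files, rather than deleting them inside the function, keeps the sketch free of file system side effects. The caller would delete those files in the same commit that updates CHANGELOG.md.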

3. How does this meet our goals?

- Quick release notes: The changeset and the release notes script make this process trivial.
- Changelog contents need to match release note format for component level updates: Same as above.
- Transparent to versioning: There are two parts to this: unreleased and released changes.
  1. Unreleased changes: This goal is met since the changeset moves with the change itself.
  2. Released changes: These changes come from the release notes, which we already forward port to main as part of each release, so this process remains unaffected too.
- No changes to back-porting: Since the changesets are always kept as individual files, they avoid merge conflict issues and work with all existing backport processes.
- Contributor experience should be improved: Even with the basic approach where we have no automation, this process is improved by not having to make multiple commits for a PR. The added automation will make this process much easier.

4. FAQ

Do we still need to run the Changelog verification workflow in the PR? Yes, since the proposal does not add anything to automatically validate that a changeset is present in the PR. However, the verification action will need to be modified to check that the PR adds a changeset file instead of a changelog entry. This is fairly simple to implement and can also be done by forking the existing changelog verifier and modifying it to check for the changeset.
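
The modified check could look roughly like the sketch below. It is a Python illustration only and does not reflect the existing changelog verifier's code: the function names are hypothetical, and it assumes the workflow can supply the list of file paths added by the PR. Handling of PRs that opt out with `skip` is left out.

```python
from pathlib import PurePosixPath

# Folder named in the proposal; paths are relative to the repository root.
FRAGMENTS_DIR = PurePosixPath("changelogs/fragments")


def find_changesets(added_files):
    """Return the added files that are .yml changesets directly in the fragments folder."""
    found = []
    for path in added_files:
        candidate = PurePosixPath(path)
        if candidate.parent == FRAGMENTS_DIR and candidate.suffix == ".yml":
            found.append(path)
    return found


def verify_changeset(added_files):
    """Return (ok, message), enforcing exactly one changeset per PR."""
    found = find_changesets(added_files)
    if len(found) == 1:
        return True, f"Found changeset {found[0]}"
    if not found:
        return False, "No changeset file added under changelogs/fragments"
    return False, f"Expected one changeset per PR, found {len(found)}"
```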

dblock commented 1 year ago

I really like the idea of PR labels to generate the change sets (or CHANGELOG). So could we avoid having the change set generation be added to the PR every time?

Assuming all pull requests are labeled and that we know what went into a release (baseline), I think a bot should be able to generate the change sets without having to append them to every PR.

Know the baseline.
For each branch, sorted in order of releases:
1. List all unreleased commits.
2. Map commits to labeled PRs.
3. Remove any PRs that are part of an earlier release number.
4. Create the change set from the remaining PRs.
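
As a rough sketch of the steps above (in Python, with every name hypothetical), assume the baseline is a set of already released commit SHAs, and that the commit-to-PR mapping and the PR labels have been fetched beforehand:

```python
def generate_change_sets(branches, baseline, commit_to_pr, pr_labels):
    """Build one change set per release from labeled PRs.

    branches: list of (release, [commit SHAs]) sorted in order of releases.
    baseline: set of commit SHAs that are already released.
    commit_to_pr: maps a commit SHA to the PR number it came from.
    pr_labels: maps a PR number to its changelog label (e.g. "fix").
    """
    claimed = set()  # PRs already assigned to an earlier release
    change_sets = {}
    for release, commits in branches:
        # 1. List all unreleased commits.
        unreleased = [sha for sha in commits if sha not in baseline]

        # 2. Map commits to labeled PRs, and
        # 3. remove any PRs that are part of an earlier release.
        prs = []
        for sha in unreleased:
            pr = commit_to_pr.get(sha)
            if pr is None or pr not in pr_labels:
                continue
            if pr in claimed or pr in prs:
                continue
            prs.append(pr)
        claimed.update(prs)

        # 4. Create the change set from the remaining PRs, grouped by label.
        change_set = {}
        for pr in prs:
            change_set.setdefault(pr_labels[pr], []).append(pr)
        change_sets[release] = change_set
    return change_sets
```

This sketch treats a cherry-picked commit as belonging to the same PR as the original, which is a simplifying assumption about how backports are tracked.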

ashwin-pc commented 1 year ago

The goal of appending them to the PR is that the changeset is tracked with the code itself. So if we revert a change, the changeset also gets reverted automatically, and if we backport, the changeset gets backported too. Basically, the changeset travels with the change itself keeping the two in sync and honest with each other.

andrross commented 1 year ago

> The changeset travels with the change itself keeping the two in sync and honest with each other

I really like this property of solutions like this. Cherry picking commits is the norm with our branching strategy, so ensuring the changeset is tied to the commit and goes with it (and is reverted with it) makes this strategy work really well with our existing processes.

joshuarrrr commented 1 year ago

General feedback:

I'm super excited about the idea that the atomic "change entries" don't require a second commit, and can be written as part of the PR template. I also think the question is presented at the correct time (in the automated proposal), which is as part of submitting the PR and filling out the description.

Questions and comments:

How do you envision change entry collaboration and editing in the PR review lifecycle? It seems like one advantage is that maintainers could simply update the PR description directly and retrigger the workflow. Alternatively, contributors or maintainers could request changes to the changeset as part of revision commits. If we go with the automation recommendation, I'd suggest also specifying that changesets should always be generated by the automation and not edited manually. That will avoid confusion and make sure the PR description and changesets stay in sync.

> name of the file is irrelevant as long as it does not conflict with the name of another file here

Is there existing tooling to prevent changeset filename collisions? It seems a bit more straightforward for an automated workflow to manage, if it's responsible for generating all the actual changeset files.

> If necessary to view the changelog with the unreleased changes too, this can be generated by the user using the release note script.

Yeah, or we could even automate that, too, as some sort of recurring consolidation commit.

> Valid prefixes: breaking, deprecate, feat, fix, infra, doc, chore, refactor, test

When implementing the template, we probably also need more guidance on how to identify which type to use.

> We then forward port these changes to main which will simultaneously delete the changeset files and update the main changelog.

This is an implementation detail, but, ideally when we do this we can also automate and avoid merge conflicts.

joshuarrrr commented 1 year ago

In the working group meeting on 5/4/2023, we unanimously voted to adopt this proposal (the "2.2.2 Automation" variation specifically). The next steps are to expand and polish the prototype: https://github.com/ashwin-pc/test-github-action and to create the script that moves changesets to the changelog, and cleans them up.

BigSamu commented 1 year ago

Hi all,

As you can see in my last thread above, issue #5509 has been opened for an implementation of this proposal under the OpenSearch-Dashboards repo. After the successful completion of this PoC, the solution will be migrated so that all other repos under the OpenSearch Project can start implementing this process. We will keep you updated.

Regards,

Samuel

BigSamu commented 11 months ago

Hi all,

Just a heads-up on the progress of a solution for this proposal.

Initially, the PoC developed considered the use of a customized script saved in the OpenSearch Dashboards repo (generate_release_notes.js file) and a reusable GitHub Actions workflow hosted in an external repository. A PR was opened to integrate these changes into the OpenSearch-Dashboards repository.

However, due to security constraints and the evolution of an idea for a general-purpose bot to automate DevOps tasks across all OpenSearch repos, we opted to integrate the features we had built into a GitHub App called OpenSearch-bot. This App better suits the requirements in the initial proposal from @ashwin-pc and lays the groundwork for a more streamlined, efficient workflow for contributors and maintainers.

The first beta version of the bot is planned for release in the 1st or 2nd week of January for the OpenSearch-Dashboards team only. After successful testing with that team, we will start rolling the bot out to all OpenSearch teams. We also plan to present at the monthly community meetings to begin branding this tool in the community.

We are very excited about this tool we have developed during the OSCI programme. We hope the OpenSearch community finds it useful and starts using it or contributing to it, either to increase their productivity or add more functionalities to automate any DevOps process.

BigSamu commented 11 months ago

Hi all,

Here is an update on this project. Changes were made in the last meeting with Ashwin. Full details are in this thread, along with the open PR to OpenSearch-Dashboards.

Planning for the release of the tool has been moved to next week. We will keep you updated.

Regards,

Samuel

BigSamu commented 10 months ago

Hi all,

Here are new updates on this proposal after today's meeting, detailed in this thread.

The PoC for Automated Changelog and Release Notes is ready with all the changes suggested in the last meeting. The merge of PR [#5519](https://github.com/opensearch-project/OpenSearch-Dashboards/pull/5519) in OSD and the roll-out of the new tool in the same repo have been scheduled for next Thursday, January 18th.

Regards,

Samuel

BigSamu commented 10 months ago

Hi all,

Following this week's meeting, the roll-out of the PoC of the automated Changelog and Release Notes project has started for OSD. In the end, it was decided to work on a new branch in OSD called feature/changelog to execute this rollout and check for any errors or issues during it.

More updates on the project will follow next week.

Regards,

Samuel

opensearch-project / .github