opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[PROPOSAL] Author CHANGELOG in each PR instead of collecting them in the last days before a release #1868

Closed dblock closed 1 year ago

dblock commented 2 years ago

What kind of business use case are you trying to solve? What are your requirements?

Release notes are of varying quality.

For example, https://github.com/opensearch-project/OpenSearch/pull/2489 is completely unreadable.

What is the problem? What is preventing you from meeting the requirements?

What are you proposing? What do you suggest we do to solve the problem or improve the existing situation?

  1. Replace authoring release notes at the end before a release with authoring release notes in each PR.
  2. Require that release notes be updated via something like danger-changelog with every change.
  3. Automate collection of release notes across all projects. https://github.com/opensearch-project/opensearch-build/issues/438 https://github.com/opensearch-project/opensearch-build/issues/698

dblock commented 1 year ago

@seraphjiang I like automation. Try this on a fork of OpenSearch (this repo), let's see how readable the result is? Enable it on main, 2.x and a released version, e.g. 1.3.5. Also can those automated release notes be edited post-fact or will they forever contain typos that developers make in commit messages? And what happens when versions change branches or two branches have the same version (happens all the time, right now plugins are struggling to increment version on main to 3.0)?

dblock commented 1 year ago

@andrross I think we want a 1:1 relationship, but we also want the changes to be humanly readable

dblock commented 1 year ago

Thinking outside the box, the goal is to address the release notes issue; a changelog is just one option. We know we need 1) automation and 2) best practices and discipline, plus a template to enforce good commit messages, whether or not we have a changelog.

Re: 2, it's asking a lot from code reviewers to check commit messages, and for contributors to amend and force-push their changes again and again

seraphjiang commented 1 year ago

@seraphjiang I like automation. Try this on a fork of OpenSearch (this repo), let's see how readable the result is? Enable it on main, 2.x and a released version, e.g. 1.3.5. Also can those automated release notes be edited post-fact or will they forever contain typos that developers make in commit messages? And what happens when versions change branches or two branches have the same version (happens all the time, right now plugins are struggling to increment version on main to 3.0)?

I tried cloning the OpenSearch repo; unfortunately, that resets the PR history rather than cloning it as well.

https://github.com/seraphjiang/OpenSearch/releases

We have added release.yml to an existing repo; I will give it a try and update here. https://github.com/opensearch-project/dashboards-anywhere

dblock commented 1 year ago

Is this how it would look? https://github.com/seraphjiang/OpenSearch/releases/tag/2.2.0

dblock commented 1 year ago

Throwing in https://www.conventionalcommits.org/en/v1.0.0-beta.2/ which could become a standard way to write commit messages. This is used in https://github.com/aws/sagemaker-python-sdk and https://github.com/aws/copilot-cli.
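As a rough illustration of what enforcing such a convention on PR titles could involve, here is a minimal sketch that parses the `type(scope): description` header shape described by Conventional Commits. The names `HEADER`, `ParsedTitle` and `parse_title` are hypothetical, and the sample titles in the usage note are made up; this is not tooling that exists in the OpenSearch repo.

```python
import re
from typing import NamedTuple, Optional

# Header shape from Conventional Commits: "type(scope): description",
# where "(scope)" is optional. The regex is an illustrative assumption.
HEADER = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]+)\))?: (?P<description>.+)$"
)


class ParsedTitle(NamedTuple):
    type: str
    scope: Optional[str]
    description: str


def parse_title(title: str) -> Optional[ParsedTitle]:
    """Return the parsed header, or None if the title does not follow the convention."""
    match = HEADER.match(title.strip())
    if match is None:
        return None
    return ParsedTitle(match["type"], match["scope"], match["description"])
```

A title such as `feat(search): add point-in-time API` parses into its three parts, while a free-form title such as `Fixed stuff` returns `None` and could be rejected by a PR check.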

dblock commented 1 year ago

@kotwanikunal I am seeing that we have both Unreleased and 2.x sections in main and 2.x. Was that intentional? Should everything be unreleased? where do backports go?

seraphjiang commented 1 year ago

Is this how it would look? https://github.com/seraphjiang/OpenSearch/releases/tag/2.2.0

Not exactly, but similar; the format is configurable.

kotwanikunal commented 1 year ago

@kotwanikunal I am seeing that we have both Unreleased and 2.x sections in main and 2.x. Was that intentional? Should everything be unreleased? where do backports go?

It was supposed to be Unreleased for main, 2.x for 2.x. Backporting has led to a mess and inconsistency, with folks not adding the change to 2.x. Currently Unreleased is the only one which has all the correct changes, and I was hoping to get main + backporting in line since that will bring the other branches to a correct state automatically with the helper.

seraphjiang commented 1 year ago

@dblock this is how it looks without adding labels like feature/bug/enhancement/break

https://github.com/opensearch-project/dashboards-anywhere/releases/tag/v0.8.0

here is release.yml https://github.com/opensearch-project/dashboards-anywhere/blob/main/.github/release.yml

seraphjiang commented 1 year ago

Example

  1. A release where most PRs have no enhancement label: https://github.com/opensearch-project/opensearch-dashboards-functional-test/releases/tag/v2.4.0-rc

  2. A release where more PRs are labelled enhancement: https://github.com/opensearch-project/opensearch-dashboards-functional-test/releases/tag/v2.4.0-rc2
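The difference between the two releases above comes down to how PR labels map to sections. A minimal sketch of that grouping, assuming a hypothetical label-to-section mapping (`CATEGORIES`), a fallback section name, and a first-match-wins rule; none of these are taken from the actual release.yml files linked in this thread:

```python
# Hypothetical mapping: section title -> labels that select it.
CATEGORIES = [
    ("Breaking Changes", {"breaking"}),
    ("Features", {"feature"}),
    ("Enhancements", {"enhancement"}),
    ("Bug Fixes", {"bug"}),
]
FALLBACK = "Other Changes"  # where unlabelled PRs end up


def group_by_label(pull_requests):
    """Group (title, labels) pairs into sections; the first matching category wins."""
    sections = {}
    for title, labels in pull_requests:
        section = next(
            (name for name, wanted in CATEGORIES if wanted & set(labels)),
            FALLBACK,
        )
        sections.setdefault(section, []).append(title)
    return sections
```

With few labels applied, nearly everything lands in the fallback section, which is why the first release reads like a flat list of PR titles.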

seraphjiang commented 1 year ago

This is GitHub's public REST API for generating release notes, which we could integrate with our release automation:

https://docs.github.com/en/rest/releases/releases#generate-release-notes-content-for-a-release
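A minimal sketch of how release automation might call that endpoint, using only the Python standard library. The helper names, the token handling, and the tag values in the usage note are assumptions for illustration; the endpoint path and the `tag_name`, `previous_tag_name` and `target_commitish` fields are those documented at the link above.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_generate_notes_request(owner, repo, tag_name, token,
                                 previous_tag_name=None, target_commitish=None):
    """Build (but do not send) the POST request for the generate-notes endpoint."""
    payload = {"tag_name": tag_name}
    if previous_tag_name:
        payload["previous_tag_name"] = previous_tag_name
    if target_commitish:
        payload["target_commitish"] = target_commitish
    return urllib.request.Request(
        url=f"{API_ROOT}/repos/{owner}/{repo}/releases/generate-notes",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def generate_notes(request):
    """Send the request; the response JSON carries `name` and `body` fields."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `build_generate_notes_request("opensearch-project", "OpenSearch", "2.4.0", token, previous_tag_name="2.3.0")` would ask for notes covering the changes between those two (hypothetical) tags. Because the generated `body` comes back as text, it could in principle be edited before being published.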

rursprung commented 1 year ago

@kotwanikunal I am seeing that we have both Unreleased and 2.x sections in main and 2.x. Was that intentional? Should everything be unreleased? where do backports go?

It was supposed to be Unreleased for main, 2.x for 2.x. Backporting has led to a mess and inconsistency, with folks not adding the change to 2.x. Currently Unreleased is the only one which has all the correct changes, and I was hoping to get main + backporting in line since that will bring the other branches to a correct state automatically with the helper.

tbh, that sounds wrong. the changelog for 2.x should only live on the 2.x branch - the main branch should only have the changelog for main (i.e. the moment 2.x is branched away the two changelogs start to diverge). otherwise you have a nightmare in maintaining the changelog (as you have now).

i know that i was the one who advocated for the currently used changelog and still think that the format is correct. i however don't think that the way it's currently being done is good (esp. after i've now hassled with it a few times when creating PRs):

right now i've nearly never managed to get in a PR directly: either it failed with a false reject (lots of flaky tests around; not the subject of this issue here) and/or it ran into a merge conflict due to the changelog by the time it got an approval (or the tests were re-run). since i don't work every day (and sign all my commits with a PGP key which is of course different for work & private) the PR then has to sit around for a while (usually a week) until i get around to rebasing it again, which is a huge loss of time and might mean that it doesn't land in a specific release in the worst case.

andrross commented 1 year ago

i think it's wrong to require an entry for every commit

I agree with this. Assuming the changelog is targeted towards users/operators of the software, listing every commit seems like a lot of noise. Even beyond reformatting/test changes, if a release introduces a new feature, that feature may consist of dozens of commits but the relevant information for the changelog is probably just a single entry. As a developer, every commit is important but I'll use the tooling from git to navigate the commits as opposed to a changelog (and I probably wouldn't trust a hand-edited changelog anyway).

rursprung commented 1 year ago

just as a minor clarification: when i wrote "for every commit" i actually meant "for every PR". but that doesn't change a lot as larger features still consist of multiple PRs (leading to multiple entries for one new feature instead of a single entry) and there are lots of PRs which are not relevant for users/operators.

dblock commented 1 year ago

@rursprung @andrross I have a skip-changelog label in https://github.com/dblock/create-a-github-issue that could be implemented here for dependabot and backports, see https://github.com/dblock/create-a-github-issue/pull/16/files. WDYT? Does anyone want to make that change here and clean up the CHANGELOG in main and 2.x?
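The rule being proposed can be sketched as a small decision function: a PR passes if it carries the skip-changelog label or if it touches the changelog file. The function name, the `CHANGELOG.md` path, and the message wording are assumptions for illustration, not the implementation in the linked PR.

```python
SKIP_LABEL = "skip-changelog"
CHANGELOG_PATH = "CHANGELOG.md"  # assumed location of the changelog


def changelog_check(changed_files, labels):
    """Return (ok, message) for a PR given its changed file paths and labels."""
    if SKIP_LABEL in labels:
        return True, f"skipped: PR is labelled {SKIP_LABEL}"
    if CHANGELOG_PATH in changed_files:
        return True, f"ok: {CHANGELOG_PATH} was updated"
    return False, f"failed: update {CHANGELOG_PATH} or add the {SKIP_LABEL} label"
```

Dependabot and backport PRs would then only need the label applied automatically to pass the check.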

dblock commented 1 year ago

@seraphjiang Want to try to PR what you're suggesting into OpenSearch main, including removing existing functionality? Let's discuss a concrete change proposal with @rursprung @kotwanikunal and @andrross? I still think we would absolutely need a way to force some standards on PR titles though like conventional commits, too.

andrross commented 1 year ago

From keepachangelog.com:

Can changelogs be bad?

Yes. Here are a few ways they can be less than useful.

Commit log diffs

Using commit log diffs as changelogs is a bad idea: they're full of noise. Things like merge commits, commits with obscure titles, documentation changes, etc.

I think our 1:1 requirement that all PRs have a changelog entry results in us implementing this bad practice. The changelog ends up being slightly more readable than a commit log (entries are categorized) but we have no way to filter out the commits that are not useful to users. Ultimately "useful to users" is a judgment call, so I think we need a process here that allows for the developer and maintainer to make a judgment about whether to include a changelog entry for a given PR.

ashwin-pc commented 1 year ago

Having read through as much of this thread as possible, I don't think I've seen anyone suggest this, so apologies if this was already discussed. Can't we decouple the changelog edits from the PRs responsible for the change (with some automation to make tracking the two easier)?

  1. We add an automated changelog label to every PR that needs a changelog entry (new entry or update). It automatically creates a new issue in the repo that's assigned to the author of the PR.
  2. The new issue also has a link to the PR, a summary of the change, and other goodies to make creating a changelog change as easy and traceable as possible.
  3. If the PR has any backport tags, the changelog issue will also have those tags called out for tracking.
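The three steps above could be sketched as a function that turns PR metadata into an issue payload. Everything here is hypothetical: the shape of the `pr` dictionary, the `[CHANGELOG]` title prefix, and the assumption that backport labels start with `backport`. The returned keys (`title`, `body`, `assignees`, `labels`) mirror the fields GitHub accepts when creating an issue.

```python
def changelog_issue_for(pr):
    """Build an issue payload for a PR labelled `changelog`, else return None.

    `pr` is assumed to be a dict with the keys:
    number, title, author, html_url, labels.
    """
    labels = pr["labels"]
    if "changelog" not in labels:
        return None
    # Carry backport labels over so the follow-up issue can be tracked per branch.
    backports = sorted(label for label in labels if label.startswith("backport"))
    body_lines = [
        f"Changelog entry needed for {pr['html_url']}",
        "",
        f"Summary: {pr['title']}",
    ]
    if backports:
        body_lines += ["", "Backport targets: " + ", ".join(backports)]
    return {
        "title": f"[CHANGELOG] {pr['title']} (#{pr['number']})",
        "body": "\n".join(body_lines),
        "assignees": [pr["author"]],
        "labels": ["changelog"] + backports,
    }
```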

Pros:

Cons:

ashwin-pc commented 1 year ago

Another suggestion I want to drop in here is GitLab's approach, which they documented in this blog post: https://about.gitlab.com/blog/2018/07/03/solving-gitlabs-changelog-conflict-crisis/

dblock commented 1 year ago

  1. We add an automated changelog label to every PR that needs a changelog entry (new entry or update). It automatically creates a new issue in the repo that's assigned to the author of the PR.

I think you can address the cons by adding a skip-changelog label (easily implemented in https://github.com/dblock/create-a-github-issue/pull/16 for example) to label those PRs that don't need an entry.

andrross commented 1 year ago

Here's the PR (#5067) for creating the release notes for 2.4 in OpenSearch. I manually merged entries into a single "feature" entry where appropriate, which is desirable in my opinion (breaking the 1-entry-for-every-PR constraint currently enforced). We had some duplicate/incorrect entries split between the [Unreleased] and [2.x] section of the file. Overall it wasn't too bad to clean up.

That PR was on the 2.4 branch. I believe the CHANGELOG on the 2.x branch can be emptied out now (all changes from 2.x were released into 2.4). The main branch is a mess though. It contains all changes that have been released, interleaved with changes that have not been backported and will remain unreleased until 3.0. This process is not working well with our branching strategy, specifically the fact that we're committing next-major-version changes alongside next-minor-version changes that get backported. All next-minor-version changes should not have an entry on the main branch because they will never be "unreleased" from the perspective of the next-major-version.

dblock commented 1 year ago

@andrross Yes, I agree it's not working :( Let's keep looking at @kotwanikunal to figure out best path forward until he tells us otherwise?

andrross commented 1 year ago

@dblock @kotwanikunal

I've written up an FAQ to be included in the CHANGELOG.md file in #5092

This defines a slight change in the process. First, it specifies that not every PR should include a changelog entry (and assumes we'll have a skip-changelog label). Second, it defines how the changelog should work with our branching strategy. Please take a look.

andrross commented 1 year ago

Just an update after the 2.5 release... unfortunately the latest process is also not working great. The problems seemed to mostly arise around the question of where a CHANGELOG entry should go. In an ideal world, the [Unreleased 2.x] section on the main branch should be identical to the [Unreleased 2.x] section on the 2.x branch. The process to create release notes should be: move the [Unreleased 2.x] section on main to a release-notes-2.5.0.md file and then backport to the 2.x and 2.5 branches. In practice that did not work because some changes on main were put into the [Unreleased 3.0] section but then backported and put into the [Unreleased 2.x] section on the 2.x branch (this is easy for reviewers to miss because it is unintuitive that you have to expand out the CHANGELOG file to even see what section the entry is in). @kotwanikunal ultimately ended up scanning through commits and manually auditing to create the release notes file.
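The mismatch described above could be detected mechanically. Here is a minimal sketch, assuming sections are introduced by `## [Name]` headings and entries are lines starting with `- `; both are assumptions about the file layout, and the function names are hypothetical.

```python
def extract_section(changelog, heading):
    """Return the `- ` entry lines under `## [heading]`, up to the next `## ` heading."""
    lines = changelog.splitlines()
    marker = f"## [{heading}]"
    start = next((i + 1 for i, line in enumerate(lines) if line.strip() == marker), None)
    if start is None:
        return []
    entries = []
    for line in lines[start:]:
        if line.startswith("## "):  # next top-level section; "### Added" does not match
            break
        if line.startswith("- "):
            entries.append(line.rstrip())
    return entries


def section_drift(main_changelog, branch_changelog, heading):
    """Return (only_on_main, only_on_branch) entries for the given section."""
    main_entries = set(extract_section(main_changelog, heading))
    branch_entries = set(extract_section(branch_changelog, heading))
    return sorted(main_entries - branch_entries), sorted(branch_entries - main_entries)
```

Running `section_drift(main_text, branch_text, "Unreleased 2.x")` on the two branches' files would surface entries that were filed under [Unreleased 3.0] on main but under [Unreleased 2.x] after backporting, instead of leaving that to a manual audit.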

nknize commented 1 year ago

I mentioned this in a PR comment and will post it here since I haven't followed this moving target closely.

I suggest we switch the changelog to adopt a more standard approach like other projects and use the issue number instead of the PR. It would enforce having an issue for each PR, and stop this burden of pushing a changelog commit that updates the url in the changelog for each PR.

dblock commented 1 year ago

@nknize Not everything has an issue, but the additional commit is actually a relatively small problem (I personally don't mind doing a git commit --amend and a force push) compared to the fact that CHANGELOG in every PR creates a ton of additional work and backport merge conflicts, even if done right. It solved one problem, but created a different one.

Another idea is to adopt a different mechanism for PR titles, and settle on something that generates CHANGELOGs in every branch automatically from PR titles that can be changed after merge. Check out https://github.com/aws/aws-cdk/pulls for what that could look like.

We also need to do this times N plugins, because the distribution release notes are all over the place in terms of quality.

nknize commented 1 year ago

Not everything has an issue

If it doesn't have an issue it doesn't need to be in the changelog.

andrross commented 1 year ago

Not everything has an issue

If it doesn't have an issue it doesn't need to be in the changelog.

I'm in favor of linking issues instead of PRs, particularly since we updated the guidance to say that not all changes require a CHANGELOG entry and added the skip-changelog label.

It solved one problem, but created a different one.

Honestly it didn't really even solve a problem (at least for the OpenSearch repo) for this latest release as the CHANGELOG was a bit of a mess between the two branches and required a lot of manual reconciliation to create the release notes. I've been nagging folks to make sure entries go in the right place since that release but I don't think we've arrived at a sustainable solution.

joshuarrrr commented 1 year ago

Although this proposal is already rolled out across the org, due to some of the problems it creates, we've decided to move forward with the alternative proposal https://github.com/opensearch-project/.github/issues/156 instead.