opensearch-project / automation-app

🤖 An automation app to handle the daily activities of your GitHub Repository.
Apache License 2.0

[PROPOSAL] Automate merging of branch version bump PRs #37

Open dbwiddis opened 3 months ago

dbwiddis commented 3 months ago

What/Why

What are you proposing?

We should enable repo/admin-level automation to merge branch version bump PRs, because maintainers aren't doing it.

What users have asked for this feature?

I requested it in the 2.14.0 retrospective.

I have repeatedly nagged other plugin maintainers to merge upstream dependencies so I can close mine.

I have received pushback from other plugin maintainers for the above nagging, blocking my efforts to actually close open PRs on repos that I maintain.

Other mentions of automation or enforcement of version bumping:

What problems are you trying to solve?

This, on a repo I maintain (source). [Screenshot 2024-08-08 at 10 46 48 PM]

And this on an upstream dependency, blocking me from merging mine (source). [Screenshot 2024-08-08 at 10 47 20 PM]

And this on another upstream dependency, blocking me from merging mine (source). [Screenshot 2024-08-08 at 10 49 17 PM]

Organization-wide, opensearch-project has 204 unmerged version bump PRs.

These are literally mouse clicks away from being closed, but it takes upstream repos to lead the way.

What is the developer experience going to be?

For maintainers like me, the ability to merge PRs on their repo because upstream repositories have appropriate versioning.

For maintainers who don't care about these PRs, they won't have to lift a finger. How awesome is that?

More seriously, minor version bump PRs are assumed to have been cut as part of automation (Autosync) and multiple developers spent multiple hours dealing with the aftermath of a branch cut prior to a version bump in the 2.14 release. That's at least an hour of my time wasted, multiplied by at least 4 other developers.

Are there any security considerations?

Patch version bumps are one of the biggest culprits here, because patch releases rarely happen.

However, when they do, it's usually for a very important issue that can't wait until the next release cycle. In this case, having plugins wait for multiple upstream dependencies to merge their patch version bumps could slow our ability to react to these.

Are there any breaking changes to the API?

Nope.

What is the user experience going to be?

Seeing well-maintained plugin repositories, without a huge backlog of ignored PRs that discourages users from contributing.

Are there breaking changes to the User Experience?

Nope.

Why should it be built? Any reason not to?

It should be built because automation already exists to create the PRs; they can be auto-merged by a bot with appropriate powers to do so, when all dependencies are met.

It should be built to save the time and effort of maintainers to do the same. In particular, GitHub action retries expire after 30 days, requiring a minute or so of effort to re-try a version bump PR after a month of upstream repos ignoring it... or similar effort to search and identify whether it's able to be merged. It's a distraction and complete waste of developer time.

What will it take to execute?

Whatever automation creates the PRs can be given the power to merge them if CI checks are green.
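For illustration, here is a minimal sketch of what a "merge only if green" gate could look like. It is not taken from any existing workflow: the bot account name, the label, and the `PullRequest`/`CheckRun` shapes are all hypothetical, and it works on in-memory data rather than calling the GitHub API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Check-run conclusions this sketch treats as "green". Counting "neutral"
# and "skipped" as passing is an assumption, not something the thread states.
PASSING_CONCLUSIONS = {"success", "neutral", "skipped"}


@dataclass
class CheckRun:
    name: str
    status: str                 # "queued", "in_progress" or "completed"
    conclusion: Optional[str]   # None until the run has completed


@dataclass
class PullRequest:
    number: int
    author: str
    labels: List[str] = field(default_factory=list)
    checks: List[CheckRun] = field(default_factory=list)


def can_auto_merge(pr: PullRequest,
                   bot_author: str = "version-bump-bot",      # hypothetical account
                   bump_label: str = "version-bump") -> bool:  # hypothetical label
    """Return True only for bot-created version bump PRs whose checks are all green."""
    if pr.author != bot_author or bump_label not in pr.labels:
        return False            # never touch PRs the automation did not create
    if not pr.checks:
        return False            # no CI signal at all is not the same as green
    return all(
        c.status == "completed" and c.conclusion in PASSING_CONCLUSIONS
        for c in pr.checks
    )


pr = PullRequest(
    number=101,
    author="version-bump-bot",
    labels=["version-bump"],
    checks=[CheckRun("build", "completed", "success"),
            CheckRun("integ-test", "in_progress", None)],
)
print(can_auto_merge(pr))  # False: one check is still running
```

Restricting the gate to PRs opened by the automation itself keeps the extra merge permission narrow, which seems to be the intent of the proposal.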

Any remaining open questions?

Why don't we just require maintainers to do this? Actually, we do, in Release Checklists, which specify merging these version bumps as part of the checklist. There are 453 open Release Checklist issues and while 3.0.0 and 2.17.0 are a couple hundred of those, there are far more than that.

dblock commented 3 months ago

I notice that many of these do not have passing CI :(

dblock commented 3 months ago

The version increment workflow is https://github.com/opensearch-project/opensearch-build/blob/main/.github/workflows/os-increment-plugin-versions.yml and https://github.com/opensearch-project/opensearch-build/blob/main/.github/workflows/osd-increment-plugin-versions.yml, moving this to opensearch-build.

dbwiddis commented 3 months ago

I notice that many of these do not have passing CI :(

for many that is because an upstream dependency hadn't bumped when CI first ran. And they don't push/retry.

gaiksaya commented 3 months ago

Auto-merge workflow that would automatically merge these PRs if the CI checks pass: https://github.com/opensearch-project/opensearch-build/blob/main/.github/workflows/automatic-merges.yml

zelinh commented 3 months ago

Linking some approaches to rerun failing CIs. https://github.com/opensearch-project/opensearch-build/issues/2706

prudhvigodithi commented 2 months ago

Following are my thoughts on having the version increment PRs auto-merged.

A rough workflow explaining the above points:

             Create Dependency Tree
                      ↓
              Core Dependency Solved (Issue 4225)
                      ↓
   ┌──────────────────────────────────────────┐
   ↓                                          ↓
Default Plugins                              Other Plugins
(Dependent only on core)                     (Dependent on core + upstream plugins)
   ↓                                          ↓
Fetch Version Increment PRs                Check Dependencies
   ↓                                          ↓
Run CI                                     Fetch Version Increment PRs for the Dependencies
   ↓                                          ↓
┌───────────────┐                          Ensure CI Passes and Build Dependencies (if needed)
↓               ↓                                  ↓
CI Passed    CI Failed                    ┌───────────────┐
   ↓               ↓                    ↓                 ↓
Auto-Merge PRs   Investigate & Fix     CI Passed       CI Failed
                    ↓                   ↓               ↓
                 Retry CI            Auto-Merge     Investigate & Fix → Retry CI → CI Passed → Auto-Merge PRs
                    ↓                   PRs             ↓
                 CI Passed              ↓               ↓
               Auto-Merge PRs       Once merged, build the artifacts
                                           ↓
                                  Re-run the (current) plugin CI's since Dependencies are merged and built
                                           ↓
                                      ┌───────────────┐
                                      ↓               ↓
                                    CI Passed       CI Failed
                                      ↓               ↓
                                  Auto-Merge       Investigate & Fix
                                      ↓               ↓
                                     End           Retry CI
                                                    ↓
                                                 CI Passed
                                                    ↓
                                             Auto-Merge PRs
                                                    ↓
                                                   End

Adding @gaiksaya @dblock @getsaurabh02 @dbwiddis @peterzhuamazon

peterzhuamazon commented 2 months ago

Thanks @prudhvigodithi for this detailed graph.

The automation app should come in handy for bringing all the information and decisions together in one place, since it has access to all the repos at once with admin permissions.

We can use the input manifest to find the dependency tree between plugins, and decide which plugins to take care of before others. If a plugin is a dependency of another and it failed the checks, that should be a hard block before moving on to the next plugin.
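As a rough sketch of that ordering idea (not an existing implementation): given a mapping from each plugin to the plugins it depends on, a topological sort yields a safe merge order, and a failed dependency blocks everything downstream of it. The plugin names and the `depends_on`/`ci_passed` dictionaries are hypothetical placeholders; how they would be derived from the input manifest is left open. Blocking only the dependents of a failed plugin, rather than halting the whole run, is one interpretation of "hard block".

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Tuple


def merge_plan(depends_on: Dict[str, List[str]],
               ci_passed: Dict[str, bool]) -> Tuple[List[str], List[str]]:
    """Split plugins into (to_merge, blocked), each listed in dependency order."""
    # static_order() yields every plugin after all of its dependencies.
    order = TopologicalSorter(depends_on).static_order()
    to_merge: List[str] = []
    blocked: List[str] = []
    blocked_set = set()
    for plugin in order:
        deps = depends_on.get(plugin, [])
        failed_here = not ci_passed.get(plugin, False)   # unknown CI counts as failed
        failed_upstream = any(d in blocked_set for d in deps)
        if failed_here or failed_upstream:
            blocked.append(plugin)
            blocked_set.add(plugin)
        else:
            to_merge.append(plugin)
    return to_merge, blocked


# Hypothetical dependency data: plugin-c needs plugin-b, which needs plugin-a.
depends_on = {
    "plugin-a": [],
    "plugin-b": ["plugin-a"],
    "plugin-c": ["plugin-b"],
    "plugin-d": [],
}
ci_passed = {"plugin-a": True, "plugin-b": False, "plugin-c": True, "plugin-d": True}

to_merge, blocked = merge_plan(depends_on, ci_passed)
print(sorted(to_merge))  # ['plugin-a', 'plugin-d']
print(blocked)           # ['plugin-b', 'plugin-c']
```

Here plugin-c is held back even though its own CI is green, because its upstream plugin-b has not merged yet.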

We can then use the metrics cluster and release dashboards to track where we are, and even compose a dependency tree as a prerequisite for the entry criteria.

Please let me know what you think.

Thanks.

gaiksaya commented 2 months ago

All we need are passing CIs. Something like this https://github.com/opensearch-project/opensearch-migrations/pull/940#issuecomment-2338884798 from the bot's perspective for hard merges (which we should not do), or have an auto-merge workflow added as commented above.

For CI runs that have expired (I believe after 30 days), closing and reopening the PR should help.
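A small sketch of how a bot might choose between these options, assuming the 30-day re-run window mentioned in this thread. The function name, the action strings, and the inputs are hypothetical; nothing here calls GitHub.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Re-run window as reported in this thread; treat the exact value as an assumption.
RERUN_WINDOW = timedelta(days=30)


def next_action(conclusion: Optional[str],
                last_run_at: datetime,
                now: datetime) -> str:
    """Pick what to do with a version bump PR based on its latest CI run."""
    if conclusion == "success":
        return "merge"
    if conclusion is None:
        return "wait"      # CI is still running, check again later
    if now - last_run_at > RERUN_WINDOW:
        return "reopen"    # run too old to re-run: close and reopen the PR
    return "rerun"         # recent failure: re-run the existing workflow run


now = datetime(2024, 9, 15, tzinfo=timezone.utc)
print(next_action("failure", datetime(2024, 8, 1, tzinfo=timezone.utc), now))   # reopen
print(next_action("failure", datetime(2024, 9, 10, tzinfo=timezone.utc), now))  # rerun
```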

prudhvigodithi commented 2 months ago

Thanks @gaiksaya and @peterzhuamazon for your inputs.

@gaiksaya, regarding "All we need are passing CIs": we need the dependencies to be built first, and for that we need the dependencies' version increments to be merged before we can build them. Once we have that, yes, I'm fine with the flow of retrying and merging. What @peterzhuamazon added is also a good idea: leveraging the bot and metrics cluster if it makes the automation easier :).

peterzhuamazon commented 2 months ago

Thanks Both.

I feel like GitHub Actions is suited to individual workflows but not really to combined actions across multiple repos. Especially when we already have so many plugins, it is not easy to update all the repos if anything changes, or when we need a centralized call on what to proceed with next.

That is why I raised using the automation app combined with the metrics cluster to do the hard work, which would also be easier to maintain over time.

Thanks.

gaiksaya commented 2 months ago

I don't believe the app would have permission to re-run CIs. @prudhvigodithi I agree we need dependent components to build first. But closing and reopening the PR once a day until it gets merged is one way to re-run the CIs. If the dependent components are ready by then, great! Else try again tomorrow.

peterzhuamazon commented 2 months ago

I don't believe the app would have permission to re-run CIs. @prudhvigodithi I agree we need dependent components to build first. But closing and reopening the PR once a day until it gets merged is one way to re-run the CIs. If the dependent components are ready by then, great! Else try again tomorrow.

The app does have full access to re-run and trigger workflows: Workflows, workflow runs and artifacts

minalsha commented 2 months ago

Thank you @dbwiddis for the proposal.

Hi @peterzhuamazon, @gaiksaya, @prudhvigodithi, @getsaurabh02: How should we proceed with this going forward? We saw the same thing with the 2.18 version bump: until upstream repos have taken care of their version bumps, dependent plugins are unable to do theirs.

gaiksaya commented 2 weeks ago

Hi @peterzhuamazon Does it make sense to move this issue to https://github.com/opensearch-project/automation-app ? We can start with merging PRs that have passing CIs and then iterate on the logic for re-running the CI's timely, etc. WDYT?

peterzhuamazon commented 2 weeks ago

Hi @peterzhuamazon Does it make sense to move this issue to https://github.com/opensearch-project/automation-app ? We can start with merging PRs that have passing CIs and then iterate on the logic for re-running the CI's timely, etc. WDYT?

I think it does make sense tho we need to understand the scale of the problem. We might need to create an additional app just to handle all PR related activities.

Thanks.

gaiksaya commented 2 weeks ago

Hi @peterzhuamazon Does it make sense to move this issue to https://github.com/opensearch-project/automation-app ? We can start with merging PRs that have passing CIs and then iterate on the logic for re-running the CI's timely, etc. WDYT?

I think it does make sense tho we need to understand the scale of the problem. We might need to create an additional app just to handle all PR related activities.

Thanks.

Thanks! We can use this issue as a problem statement for the same. Coming from https://github.com/opensearch-project/opensearch-build/issues/5171 auto-merging of the version bump PRs also applies to the core repos. See this comment. Instead of implementing an individual solution for both core repos, I think having a generic one would be great. Moving this issue to automation-app repo to discuss the in and out of scope requirements, approach, and implementation of the same.

dblock commented 1 week ago

[Catch All Triage - 1, 2, 3, 4, 5]