Open dbwiddis opened 3 months ago
I notice that many of these do not have passing CI :(
The version increment workflow is https://github.com/opensearch-project/opensearch-build/blob/main/.github/workflows/os-increment-plugin-versions.yml and https://github.com/opensearch-project/opensearch-build/blob/main/.github/workflows/osd-increment-plugin-versions.yml, moving this to opensearch-build.
I notice that many of these do not have passing CI :(
For many, that is because an upstream dependency hadn't bumped when CI first ran. And they don't push/retry.
Auto-merge workflow that would automatically merge these PRs if the CI checks pass: https://github.com/opensearch-project/opensearch-build/blob/main/.github/workflows/automatic-merges.yml
Linking some approaches to rerun failing CIs. https://github.com/opensearch-project/opensearch-build/issues/2706
The following are my thoughts on having the version increment PRs auto-merged.
Coming from https://github.com/opensearch-project/opensearch-build/issues/4225 Automatically run a min distribution build when a version increment is merged into OpenSearch core.
Start by having a discussion on moving the code freeze for core earlier than the one for the plugins. This way the code freeze for core is honored and everything is ready before the plugins finalize their changes. This should eliminate last-minute breaking changes, and the plugins can use the finalized core artifact for the version increment CI runs.
Today we have automation that creates the version increment PRs. It's up to the plugin teams to take care of the CI for those PRs and get them merged; a release manager follows up until the PRs are merged (this needs to be automated).
For the automation, we can have a workflow that ensures the plugin dependencies are built first:
Create Dependency Tree
  -> Core Dependency Solved (Issue 4225)
  -> then two branches:

Default Plugins (Dependent only on core)
  1. Fetch Version Increment PRs
  2. Run CI
     - CI Passed -> Auto-Merge PRs
     - CI Failed -> Investigate & Fix -> Retry CI -> CI Passed -> Auto-Merge PRs

Other Plugins (Dependent on core + upstream plugins)
  1. Check Dependencies
  2. Fetch Version Increment PRs for the Dependencies
  3. Ensure CI Passes and Build Dependencies (if needed)
     - CI Passed -> Auto-Merge PRs
     - CI Failed -> Investigate & Fix -> Retry CI -> CI Passed -> Auto-Merge PRs
  4. Once merged, build the artifacts
  5. Re-run the (current) plugin CI's since Dependencies are merged and built
     - CI Passed -> Auto-Merge -> End
     - CI Failed -> Investigate & Fix -> Retry CI -> CI Passed -> Auto-Merge PRs -> End
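As a rough illustration of the "Create Dependency Tree" step, the ordering could be computed by grouping components into waves, where each wave only depends on earlier waves. This is a minimal sketch, not an existing workflow: the function name `merge_waves` and the component names are hypothetical, and the dependency map is assumed to be supplied by some other step.

```python
def merge_waves(dependencies):
    """Group components into waves so that every component's upstream
    dependencies land in an earlier wave.

    `dependencies` maps a component name to the set of components it
    depends on (hypothetical input shape, assumed for this sketch).
    """
    # Copy the input and make sure every mentioned component has an entry.
    remaining = {name: set(deps) for name, deps in dependencies.items()}
    for deps in dependencies.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    waves = []
    while remaining:
        # Components with no unresolved dependencies can go next.
        ready = sorted(name for name, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        for name in ready:
            del remaining[name]
        for deps in remaining.values():
            deps.difference_update(ready)
    return waves


# Hypothetical names and edges, for illustration only.
example = {
    "core": set(),
    "plugin-a": {"core"},
    "plugin-b": {"core"},
    "plugin-c": {"core", "plugin-a"},
}
print(merge_waves(example))
```

In this sketch, "plugin-a" and "plugin-b" correspond to the "Default Plugins" branch (dependent only on core), while "plugin-c" has to wait for "plugin-a" to be merged and built.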
Adding @gaiksaya @dblock @getsaurabh02 @dbwiddis @peterzhuamazon
Thanks @prudhvigodithi for this detailed graph.
The automation app should come in handy for gathering all the information and decisions in one place, with access to all the repos at once with admin permissions.
We can use the input manifest to find the dependency tree between plugins and decide which plugins to take care of before others. If a plugin is a dependency of another and it fails the checks, that should be a hard block before moving on to the next plugin.
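The hard-block rule could look roughly like the sketch below: anything downstream of a failed component is held back, transitively. Parsing the input manifest is not shown; the dependency map, the function name `blocked_components`, and the component names are all assumptions made for illustration.

```python
def blocked_components(dependencies, failed):
    """Return the components that must wait because something upstream failed.

    `dependencies` maps a component to the set of components it depends on;
    `failed` is the set of components whose checks failed. Both shapes are
    hypothetical and assumed for this sketch.
    """
    blocked = set()
    changed = True
    # Repeat until no new component gets blocked (transitive propagation).
    while changed:
        changed = False
        for name, deps in dependencies.items():
            if name in blocked or name in failed:
                continue
            if any(dep in failed or dep in blocked for dep in deps):
                blocked.add(name)
                changed = True
    return blocked


# Hypothetical names and edges, for illustration only.
deps = {
    "plugin-a": {"core"},
    "plugin-b": {"plugin-a"},
    "plugin-c": {"core"},
}
print(sorted(blocked_components(deps, failed={"plugin-a"})))
```

Here a failure in "plugin-a" blocks only "plugin-b"; "plugin-c" depends on core alone and can still proceed.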
We can then use the metrics cluster and release dashboards to know where we are, and even compose a dependency tree as a prerequisite for the entry criteria.
Please let me know what you think.
Thanks.
All we need are passing CIs. Something like this https://github.com/opensearch-project/opensearch-migrations/pull/940#issuecomment-2338884798 from the bot's perspective for hard merges (which we should not do), or add an auto-merge workflow like the one commented above.
For CI runs that have expired (I believe after 30 days), closing and reopening the PR should help.
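A small sketch of how automation might decide that a PR needs a fresh CI run rather than a retry. The 30-day window is taken from the comment above and is an assumption, not a verified limit; the function name `needs_fresh_ci` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption from the discussion above: runs can no longer be retried
# after roughly 30 days.
RERUN_WINDOW = timedelta(days=30)


def needs_fresh_ci(last_run_at, now=None, window=RERUN_WINDOW):
    """Return True when the last CI run is too old to simply retry,
    so the PR would need to be closed and reopened (or re-pushed)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_run_at >= window


now = datetime(2024, 11, 1, tzinfo=timezone.utc)
print(needs_fresh_ci(datetime(2024, 9, 15, tzinfo=timezone.utc), now=now))
print(needs_fresh_ci(datetime(2024, 10, 20, tzinfo=timezone.utc), now=now))
```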
Thanks @gaiksaya and @peterzhuamazon for your inputs.
@gaiksaya, regarding "All we need are passing CIs":
we need the dependencies to be built first, and for that the dependencies' version increment PRs have to be merged before we can build them. Once we have this, yes, I'm fine with the flow of retrying and merging. What @peterzhuamazon added is also a good idea: leverage the bot and the metrics cluster if they make the automation easier :).
Thanks Both.
I feel like GitHub Actions is suited to individual workflows but not really suited to combined actions across multiple repos. Especially when we already have so many plugins, it is not easy to update all the repos if there are any changes, or when we need a centralized call on what to proceed with next.
That is why I raised the idea of using the automation app combined with the metrics cluster to do the hard work, which should also be easier to maintain over time.
Thanks.
I don't believe the app would have permission to re-run CIs. @prudhvigodithi I agree we need dependent components to build first. But closing and reopening the PR once a day until it gets merged is one way to re-run the CIs. If the dependent components are ready by then, great! Otherwise, try again tomorrow.
The app does have full access to re-run and trigger workflows:
Workflows, workflow runs and artifacts
Thank you @dbwiddis for the proposal.
Hi @peterzhuamazon, @gaiksaya, @prudhvigodithi, @getsaurabh02: How should we move forward with this? We saw the same thing with the 2.18 version bump: until upstream components have taken care of their version bump, dependent plugins are unable to do theirs.
Hi @peterzhuamazon, does it make sense to move this issue to https://github.com/opensearch-project/automation-app ? We can start with merging PRs that have passing CIs and then iterate on the logic for re-running the CIs in a timely manner, etc. WDYT?
I think it does make sense, though we need to understand the scale of the problem. We might need to create an additional app just to handle all PR-related activities.
Thanks.
Thanks! We can use this issue as the problem statement. Coming from https://github.com/opensearch-project/opensearch-build/issues/5171, auto-merging of the version bump PRs also applies to the core repos. See this comment. Instead of implementing an individual solution for both core repos, I think having a generic one would be great. Moving this issue to the automation-app repo to discuss the in-scope and out-of-scope requirements, the approach, and the implementation.
What/Why
What are you proposing?
We should enable repo/admin-level automation to merge branch version bump PRs because maintainers aren't doing it.
What users have asked for this feature?
I requested it in the 2.14.0 retrospective.
I have repeatedly nagged other plugin maintainers to merge upstream dependencies so I can close mine.
I have received pushback from other plugin maintainers for the above nagging, blocking my efforts to actually close open PRs on repos that I maintain.
Other mentions of automation or enforcement of version bumping:
What problems are you trying to solve?
This, on a repo I maintain (source).
And this on an upstream dependency, blocking me from merging mine (source).
And this on another upstream dependency, blocking me from merging mine (source).
Organization-wide, opensearch-project has 204 unmerged version bump PRs. These are literally mouse clicks away from being closed, but it takes upstream repos to lead the way.
What is the developer experience going to be?
For maintainers like me, the ability to merge PRs on their repo because upstream repositories have appropriate versioning.
For maintainers who don't care about these PRs, they won't have to lift a finger. How awesome is that?
More seriously, minor version bump PRs are assumed to have been cut as part of automation (Autosync) and multiple developers spent multiple hours dealing with the aftermath of a branch cut prior to a version bump in the 2.14 release. That's at least an hour of my time wasted, multiplied by at least 4 other developers.
Are there any security considerations?
Patch version bumps are one of the biggest culprits here, because patch releases rarely happen.
However, when they do, it's usually for a very important issue that can't wait until the next release cycle. In this case, having plugins wait for multiple upstream dependencies to merge their patch version bumps could slow our ability to react to these.
Are there any breaking changes to the API?
Nope.
What is the user experience going to be?
Seeing plugin repositories that are well-maintained without a huge backlog of ignored PRs that discourage them from contributing.
Are there breaking changes to the User Experience?
Nope.
Why should it be built? Any reason not to?
It should be built because automation already exists to create the PRs; they can be auto-merged by a bot with appropriate powers to do so, when all dependencies are met.
It should be built to save the time and effort of maintainers to do the same. In particular, GitHub action retries expire after 30 days, requiring a minute or so of effort to re-try a version bump PR after a month of upstream repos ignoring it... or similar effort to search and identify whether it's able to be merged. It's a distraction and complete waste of developer time.
What will it take to execute?
Whatever automation creates the PRs can be given the power to merge them if CI checks are green.
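The merge decision itself could be as small as the sketch below. It assumes the automation has already collected the conclusion of each CI check on the PR and knows whether each upstream version bump has been merged; how that data is fetched is not shown. The function name `ready_to_merge`, the input shapes, and the choice to treat "neutral" and "skipped" as passing are assumptions for illustration.

```python
# Conclusions treated as "green" in this sketch (a policy assumption).
PASSING = {"success", "neutral", "skipped"}


def ready_to_merge(check_conclusions, upstream_merged):
    """Decide whether a version bump PR can be auto-merged.

    `check_conclusions` is a list of conclusion strings, with None for
    checks that are still running. `upstream_merged` maps each upstream
    component to True once its own version bump has been merged.
    Both input shapes are hypothetical.
    """
    if not check_conclusions:
        # No CI results at all is not the same as passing CI.
        return False
    if not all(upstream_merged.values()):
        return False
    return all(conclusion in PASSING for conclusion in check_conclusions)


print(ready_to_merge(["success", "skipped"], {"core": True}))
print(ready_to_merge(["success", "failure"], {"core": True}))
print(ready_to_merge(["success"], {"core": True, "plugin-a": False}))
```

Requiring at least one check result avoids merging a PR whose CI never ran or has expired.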
Any remaining open questions?
Why don't we just require maintainers to do this? Actually, we do, in Release Checklists, which specify merging these version bumps as part of the checklist. There are 453 open Release Checklist issues and while 3.0.0 and 2.17.0 are a couple hundred of those, there are far more than that.