Open peterzhuamazon opened 2 months ago
Wondering if it would take too many labels to transfer issues from one repo to another. How about something like "@app please transfer this issue to security repo"?
If the event is triggered only when the app is mentioned, wouldn't that make it a push-based model? We don't have to keep monitoring the label events. Also, most maintainers currently tag admins in a similar fashion for the transfer, so it wouldn't be much of a change from the user's perspective.
Thanks for the comment. Note that the label transfer method is just a POC; we have since reviewed it and decided to use other methods. The section above is there to showcase what the app is capable of, and we will definitely improve the functions later.
agree with @gaiksaya on too many labels. I have a few questions:
1. Is the code logic written in the app? Where can I see the code for the PoC?
2. Is it always listening to all the GitHub events across all the repos?
3. With respect to auto-merge of backport PRs, would it still require approval or not?
Hi @rishabh6788 ,
Thanks for commenting.
Thanks.
I am really excited to see a bot that does all of this!
A few things that are important.
1. One should be able to contribute workflows/tasks to the bot easily. So the code organization needs to let a single workflow be authored independently from other workflows.
2. The blast radius of this bot is huge. It will have r/w admin-level access to the org. So we need a robust set of tests.
@peterzhuamazon I recommend moving whatever you have in a private repo to public as early as possible and starting with a very simple workflow so we can ensure that (1) and (2) are robust enough from the beginning.
Thanks @dblock for the suggestions here.
Yeah, we plan to get the source code out in a new repo soon. As of now, the app is running with the PoC code and only focuses on adding issues to a project. We should soon break the code into different layers of abstractions/objects so it can be easily maintained and extended.
Thanks!
Thanks for the reply @peterzhuamazon, looks promising and excited to see what this app can do for us.
Not sure if it has already been considered, but can we have an app that listens to all the major events: push, pull_request, issue_comment and label? The ones that are not in scope of this project can be a no-op, and we can implement logic for the ones we want. For example, for transferring, we could have an action performed when an admin or repo maintainer adds a comment such as transfer: <destination-repo>, and the app just transfers the issue.
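As a rough illustration of this idea (not the app's actual code), the sketch below routes incoming events to handlers and treats anything without a handler as a no-op. The `IncomingEvent` shape, the `route` function, and the exact comment pattern are assumptions made for the example; only the event names and the `transfer: <destination-repo>` comment format come from the suggestion above.

```typescript
// Hypothetical event names and payload shape, loosely modeled on the events listed above.
type EventName = "push" | "pull_request" | "issue_comment" | "label";

interface IncomingEvent {
  name: EventName;
  commentBody?: string; // only meaningful for issue_comment events
}

type Action = { kind: "transfer"; destination: string } | { kind: "noop" };

// Assumed format: the whole comment is `transfer: <destination-repo>`.
const TRANSFER_PATTERN = /^\s*transfer:\s*([A-Za-z0-9._-]+)\s*$/i;

function handleIssueComment(event: IncomingEvent): Action {
  const destination = TRANSFER_PATTERN.exec(event.commentBody ?? "")?.[1];
  return destination ? { kind: "transfer", destination } : { kind: "noop" };
}

// Only in-scope events get a handler; everything else falls through to a no-op.
const handlers: Partial<Record<EventName, (e: IncomingEvent) => Action>> = {
  issue_comment: handleIssueComment,
};

function route(event: IncomingEvent): Action {
  const handler = handlers[event.name];
  return handler ? handler(event) : { kind: "noop" };
}
```

A real handler would also need to confirm that the commenter is an admin or repo maintainer before acting on the returned action.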
The same can be extended in future to probably add CHANGELOG entries to pull_requests across all repos that opt-in and other actions that you have already mentioned.
Also, it would be great if, before the code PoC, we could have a high-level design review of what the app design would look like and which components are involved in its functioning. I have basic queries, such as where this app will be running and whether it will be a pull-based or webhook-based implementation.
Agree with @rishabh6788's suggestion: a global app that does everything. Rather than a listening (pull-based) model, I believe a push-based model would be better, where we tag the app and then name the action item to carry out. For example:
@opensearch-ci-infra Transfer this issue to foo repo
@opensearch-ci-infra Add Changelog entry
@opensearch-ci-infra Run the performance test on this PR
This way, you don't have to keep listening but just act on mention-based events. We can make it simple by just keeping it comment based. All action items can follow the same pattern. It would be great to take scaling and increasing scope into consideration. Starting small and incrementally increasing the action items would be the way to go.
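A minimal sketch of how such mention-based commands could be parsed is shown below. The handle and the three phrasings are taken from the examples above; the `Command` type, the `parseMention` function, and the strict phrase matching are assumptions for illustration only.

```typescript
// Handle taken from the examples above; everything else is a hypothetical design.
const APP_HANDLE = "@opensearch-ci-infra";

type Command =
  | { kind: "transfer"; repo: string }
  | { kind: "changelog" }
  | { kind: "performance-test" };

// The mention must be followed by whitespace, so longer handles do not match by accident.
const MENTION_PREFIX = new RegExp(`^${APP_HANDLE}\\s+(.+)$`, "i");

function parseMention(comment: string): Command | null {
  const text = MENTION_PREFIX.exec(comment.trim())?.[1]?.trim();
  if (!text) return null; // the app was not mentioned at the start of the comment

  const repo = /^transfer this issue to (\S+) repo$/i.exec(text)?.[1];
  if (repo) return { kind: "transfer", repo };
  if (/^add changelog entry$/i.test(text)) return { kind: "changelog" };
  if (/^run the performance test on this pr$/i.test(text)) return { kind: "performance-test" };
  return null; // mentioned, but the request is not a recognized action item
}
```

Because every action item follows the same "mention, then command" pattern, adding a new one in this sketch means adding one variant to `Command` and one matching rule.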
Hi @rishabh6788 @gaiksaya ,
The framework I am building already takes care of the listen step, so we do not need to worry about that implementation.
We can specify exactly which events we listen for, so we are not overwhelmed by events.
Also, this is a global app framework, but each action would have its own listener, so actions do not conflict with or step on each other. The app can do all of these things at the same time without compromising performance.
We would, however, be limited by the GitHub App quota if all the actions run under the same app ID. If needed, we can create more GitHub App entities, wrap them all under the same umbrella, and have them act in coordination on our defined actions.
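To make the "each action has its own listener" idea concrete, here is a small sketch of a registry in which every action declares the events it cares about. This is not the framework's real API: `ActionListener`, `ListenerRegistry`, and the string results are hypothetical names used only to illustrate the design.

```typescript
// Hypothetical event and listener shapes for illustration.
interface WebhookEvent {
  name: string;
  payload: unknown;
}

interface ActionListener {
  action: string; // name of the action this listener belongs to
  events: string[]; // the only events this action wants to receive
  handle: (event: WebhookEvent) => string; // returns a short description of what it did
}

class ListenerRegistry {
  private listeners: ActionListener[] = [];

  register(listener: ActionListener): void {
    this.listeners.push(listener);
  }

  // The union of declared events is all the app needs to subscribe to.
  subscribedEvents(): string[] {
    const all = this.listeners.reduce<string[]>((acc, l) => acc.concat(l.events), []);
    return all.filter((name, index) => all.indexOf(name) === index).sort();
  }

  // Each event reaches only the listeners that declared it, so actions stay independent.
  dispatch(event: WebhookEvent): string[] {
    return this.listeners
      .filter((l) => l.events.indexOf(event.name) !== -1)
      .map((l) => l.handle(event));
  }
}
```

Events that no action declared never reach any handler, which is one way to keep the app from being overwhelmed by unrelated activity.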
Thanks.
My recommendation was the opposite of this. Listening will eventually become overwhelming and result in lots of API calls across the org. I was suggesting a push-based model where the app acts only when tagged (mentioned, in GitHub terms). We do, however, need robust checks on who is mentioning the app, but that is doable as well. This might need some more research, but it would be much easier, since comments are easier than labels/events, which are permission based and might need time to be adopted by the community. Based on the requirements and asks, it would be great to see the overall design with pros and cons so that we can make a call accordingly.
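One possible shape for the "who is mentioning the app" check is sketched below. It relies on the `author_association` field that GitHub includes on issue comments; the trusted set chosen here and the `mayCommandApp` function name are assumptions, not a decided policy.

```typescript
// Values GitHub reports in a comment's author_association field.
type AuthorAssociation =
  | "OWNER"
  | "MEMBER"
  | "COLLABORATOR"
  | "CONTRIBUTOR"
  | "FIRST_TIME_CONTRIBUTOR"
  | "FIRST_TIMER"
  | "MANNEQUIN"
  | "NONE";

// Subset of the issue comment payload used by this sketch.
interface IssueComment {
  body: string;
  user: { login: string };
  author_association: AuthorAssociation;
}

// Assumed policy: only these associations may command the app.
const TRUSTED: AuthorAssociation[] = ["OWNER", "MEMBER", "COLLABORATOR"];

function mayCommandApp(comment: IssueComment, appHandle: string): boolean {
  const mentioned = comment.body.toLowerCase().indexOf(appHandle.toLowerCase()) !== -1;
  const trusted = TRUSTED.indexOf(comment.author_association) !== -1;
  return mentioned && trusted;
}
```

Note that `MEMBER` only says the commenter belongs to the organization; a stricter design would check the commenter against the repository's maintainer list before acting.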
We are not limited to labels; we can monitor all events. The Probot framework provides multiple ways to interact, so we don't need to implement that ourselves. As the framework is organized around separate actions, each action can be its own instance, so no single instance gets overwhelmed. If an action is too big, we can just split it into more fine-grained actions.
The code is now public for review: https://github.com/opensearch-project/automation-app
Thanks.
[RFC] Building a GitHub Automation App for OpenSearch GitHub Org
Introduction
As the OpenSearch organization continues its journey towards a more open and transparent future, we have faced several operational challenges that require manual steps by OpenSearch-Project Admins. As OpenSearch moves towards a foundation model focused on scalability, efficiency, and transparency, we want to solve these challenges with a permanent solution. This RFC proposes developing a GitHub Automation App specifically for the OpenSearch GitHub Organization to handle automation tasks on behalf of admins going forward.
Motivation
The OpenSearch project wants to create a transparent, open-source, and community-focused development model. However, the manual processes we apply to manage the repositories in our GitHub Organization are not efficient enough to scale. For instance, if an issue is opened in the wrong repository, the repo maintainers must tag the opensearch-project/admins group in a comment and wait for someone to manually transfer the issue. Similarly, if PR authors forget to add a label that triggers a specific GitHub Action, a repo maintainer must step in, which further delays the review process. These kinds of issues happen regularly. To address these challenges, we have created a detailed Problem Statement section that outlines the specific issues we face and the proposed solutions based on the GitHub Automation App. By automating key tasks through the App, we can enhance the efficiency of repository management, reduce the dependency on admins/maintainers, and build a more seamless collaboration environment.
Tenets
Problem Statement
1. Issue Management Automation
When an issue is opened in the wrong repository, maintainers must tag the @opensearch-project/admin group, and an admin must manually transfer the issue. This process is time-consuming and can lead to delays in issue resolution.
The App will automate issue management tasks, such as transferring issues between repositories, assigning RFC/Meta issues to the roadmap project board, and auto-merging backport PRs when GitHub Checks pass, so that issues are handled efficiently and consistently.
2. Label Management and Documentation Support
The App will enforce the use of labels, such as need-documentation, and automate the addition of labels based on user requirements. The label will then trigger issue creation in the documentation-website repository. This will ensure that documentation requirements are raised early for quick follow-up before the release process happens.
3. Permission and Access Control Management
The App will automate access control management by moving departing members to an Emeritus section, removing their access, and making PRs and announcements. When a new maintainer gets nominated, the app will assess their contributions and start a community voting thread, ensuring that access control is properly managed.
4. Executor of Metrics Project
The App will serve as the frontend executor of the metrics cluster, acting as a bridge between users and the backend metrics cluster. It will provide an interface to display useful metrics on demand and showcase important information during the release phase. We can think of other use cases as well.
5. Bulk Operations Across Multiple Repositories
The App will enable bulk PR creation across multiple repositories. This will simplify the process of implementing organization-wide changes, ensuring consistency and reducing the time required to manage multi-repo updates.
Proof of Concept
In the past few months, we have made a few proof of concept GitHub Apps to tackle issue management automation.
Auto Issue Transfer:
[...] Transfer to Repo B to the issue
[...] issues.labeled event on the issue, verify label content, and identify Repo B as destination
Add RFC/Meta Issues to Roadmap Project:
[...] RFC or Meta label upon issue creation
[...] issues.labeled event on the issue, verify label content as either RFC or Meta
[...] Roadmap:Security since the issue is related to Security
[...] issues.labeled event on issue, verify label content, and identify Security as field entry
[...] OpenSearch Roadmap field, with the value of the field being Security
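The label-driven flows above can be sketched as a pure planning function that inspects an issues.labeled event and decides what the app should do. The label names and the OpenSearch Roadmap field come from the description above; the `Plan` type, the `planForLabel` function, and the exact label patterns are hypothetical and are not taken from the PoC code.

```typescript
// Subset of an issues.labeled webhook payload used by this sketch.
interface LabeledEvent {
  action: "labeled";
  label: { name: string };
  issue: { number: number };
  repository: { name: string };
}

// Hypothetical description of what the app would do next.
type Plan =
  | { kind: "transfer"; issue: number; from: string; to: string }
  | { kind: "add-to-roadmap"; issue: number }
  | { kind: "set-roadmap-field"; issue: number; field: string; value: string }
  | { kind: "ignore" };

function planForLabel(event: LabeledEvent): Plan {
  const name = event.label.name.trim();
  const issue = event.issue.number;

  // "Transfer to <repo>" label: verify the label content and extract the destination.
  const to = /^Transfer to (.+)$/i.exec(name)?.[1]?.trim();
  if (to) return { kind: "transfer", issue, from: event.repository.name, to };

  // RFC or Meta label: the issue belongs on the roadmap project.
  if (name === "RFC" || name === "Meta") return { kind: "add-to-roadmap", issue };

  // "Roadmap:<value>" label: the value becomes the OpenSearch Roadmap field entry.
  const value = /^Roadmap:(.+)$/.exec(name)?.[1]?.trim();
  if (value) return { kind: "set-roadmap-field", issue, field: "OpenSearch Roadmap", value };

  return { kind: "ignore" };
}
```

Keeping the decision separate from the GitHub API calls makes this part easy to unit test, which matters given the level of access the app would have.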
Opportunities
While the RFC outlines several key places in which the App can help improve the repo management, we think that there could be additional opportunities for enhancement. We invite the community to propose more use cases and features that could be added into the GitHub Automation App.
Things to consider:
We encourage the community to provide feedback and suggestions by commenting on this RFC issue. Let us know what you think.
Next Steps
We will go ahead and publish Meta issues and Design Proposals, and start working on the App based on the aforementioned Proof of Concept.
Conclusion
By automating key tasks with the GitHub Automation App, we can further reduce manual intervention, formalize processes, and improve transparency. The app will create a more efficient, scalable, and contributor-friendly environment.
Thanks for reading.