kyma-project / test-infra

Test infrastructure for the Kyma project.
https://status.build.kyma-project.io/
Apache License 2.0
38 stars 181 forks source link

POC Automatic Approval Service #7285

Closed tobiscr closed 3 weeks ago

tobiscr commented 1 year ago

Description

My moving away from a central repository for Kyma, the teams will maintain their repositories by themself. Also the central service of automated approvals will not be available in the future.

The alternative option, to use Github actions for auto-approval flows, will be disabled or for security reasons.

Goals of this POC is to offer an option for teams, to enable automated auto-approval mechanisms which allow:

AC:

dekiel commented 1 year ago

Proposal of PR reviews automation which doesn't relay on Prow approval flow, so doesn't require removing write permissions on repositories. #4786 Implementation for proposal #7017 Automation is running for test-infra repo for all PRs created by dependabot and PRs with changes in ^docs/index.md file.

Ressetkk commented 1 year ago

Im going to copy-paste my comment from #7017 for better visibility... I would like to see detailed comparison between Tide queries and this approach. That includes:

Differences in configuration Limitations of an in-house solution Required changes in branch protection rules What needs to be worked around with clear explanation why it has to be done that way I don't like that you expect a bot user to be added as an owner for the code. Bots should not be the owners of the files, but rather operate and execute actions on specific settings. With this approach we are losing the ability to merge PRs without any of the supervision and review as we have to be in compliance with branch protection rules.

That means if you want to for example adjust query to automatically push bumps without PR you have to create fake status for required contexts (tide and cal-assistant). This should not be the case.

If your idea comes live the way it is right now, you have to maintain at least 2 bots for every team. One that creates PRs and the other that approves them. Given that we have 6-7 team that means additional 14 account that you have to manage the PATs. Every on of them has to be added to the CODEOWNERS file, that adds additional overhead and micro management.

I don't like the idea. It's a workaround to the problem that we created ourselves and doesn't solve the problem of lack of automation that GitHub offers in free OSS organization tier.

tobiscr commented 1 year ago

We got 3 frontrunners for this new feature:

Ressetkk commented 1 year ago

Consolidated my messages from Slack as counter-suggestion:

Right now:

Image

I just feel that if we dropped the security measures like branch protection, CODEOWNERS and keep permissions only for maintainers (teams) that would also work for us. If someone breaks anything that'd be on the team maintaining the repo... I don't want write access to repos that I'm not maintainer of, yet right now I have write access to anything... Yikes.

Maybe a way to go is just to accept the risk but reduce it only to actual maintainers of the repo. Like us only responsible for test-infra we can have write access there and agree that we don't touch anything that bot manages. Other teams would have their write access revoked so it'd be only up to us to break something.

If you let maintainers full access to the repo, they can do whatever automations they like without much effort from us. Externals (other teams) will still have to open PR with changes since their write access will be revoked.

With accepted risk, the actual cost of implementation and maintain would go way down drastically. Developers still have access to UI elements but on an agreement that they don't touch anything that bot manages - specific labels, PR merges and status checks. Risk is reduced to actual maintainers by revoking Write access to people not involved in the development for each repo. Externals can use chatops to create and label issues. They don't use it as board. We get back tools for auto label PRs, auto merges built in Prow and we don't need to maintain shitload of accounts for each team.

And it goes down to:

To conclude Prow which can handle code ownership very well for anyone in org and actually it is something that works well for another SAP product - Gardener (see https://github.com/gardener/gardener)

Reducing the number of people with write access for is:

Pros:

Cons:

That's why I believe leaving write access to maintainers of the repos but accepting that they are responsible for any breaks, and restricting it for anyone else not affiliated with their board is a good middle ground. You get assurance that your board issues are not tampered by anyone else but your team. Externals will have to deal with it, but maintainers would be the only ones that could push directly anything. Everyone else would be forced to use PRs for everything.

It is something that should be proposed on organisation meeting, but we must have our team-wide agreement that we want to show it.

tobiscr commented 1 year ago

Please verify the process in the description. I fear the discussion is mostly focusing on the auto-merge part which is just one piece of three we would like to address in this solution:

1) Auto PR creation by defining triggers (e.g. triggers can be: PR includes a change of a particular file, PR was created by particular person etc. ) 2) Auto approval (approval happens also based on defined rule likes: creator of the PR, particular labels are defines, PR includes changes for particular files etc.) 3) Auto-Merge can be disabled (it's expected that auto-merge is per default enabled and works by using Tide, we want to provide also an option to block auto-merges - again based on rules which can be configured)

Ressetkk commented 1 year ago

@tobiscr it goes down to the implementation. Both options are meant to do the same because:

  1. Auto PR creation is possible regardless of the auto-merge flow. It's something that is supposed to just create a PR with some logic. Every repo maintainer can do their own automations there be it in ProwJob, or by their own microservice.
  2. Auto approval of PRs is needed only because of the limitation that branch protection rules enforce. By dropping the rules, you make this point invalid, because there is no rule that require approval before merge.
  3. Auto-merge is already managed by Tide. It can also be disabled by simply excluding repo from its query. You can define microservice automations that can be used to block some PRs, if the conditions aren't met. See our needs-tws plugin, which was blocking PR merge by setting up do-not-merge label which was used by Tide to block the PR. Moreover, you can define tide query to block merges of MD files, but not from bot user, by simply adding separate query for bot as author - this is the flow where you want your MD files to be reviewed, but not for bots.

Tide manages all PRs automerges by periodically invoking pre-defined queries in its config. It contains:

Also with tide and it's powerful queries you can secure release process, and enforce approval from specific people, like release managers. Just define separate query for branch release-+d\.+d that enforces every PR needs to have a release-specific label, add a Prow plugin that requires approval from release manager GH group, and you are basically done.

If you are interested, tide documentation provides pretty much comprehensive flowchart how it merges PRs https://docs.prow.k8s.io/docs/components/core/tide/

Additionally for more information about configuring tide, see https://docs.prow.k8s.io/docs/components/core/tide/config/ The most vital parts are query field description.

We had this flow actually but went back from it when we restricted write access and everyone was complaining because of non-working features in boards. All PRs regarding autobumps of Prow and test-infra job images, and even I enabled dependabot to be automerged this way, were merged automatically without any human interactions, without additional bot approvals, and other additional tools that we had to maintain. It was easily configurable through Prow's config and didn't cost us much. It will be super easy to incorporate it if we consider accepting risks that I mentioned before. Additionally we had other nice QoL automations that devs liked - autolabeling PRs based on the directory they touched.

If you really want to have security at highest point, creating separate central repository (backlog) only for issues where everyone have write access and locking down repos enforcing PR flow on everyone except bots. It is probably the best, most secure and robust way for automations, automerges, direct pushes and so on. But it's the change that requires even more work because transferring issues will take some time, but should be doable. Have fun pushing this through the rest though...

I don't know what else I can say to make my point valid and be even considered. It works for Gardener - SAP OSS team, like kyma, RedHat, Knative, Istio, K8s, Tekton...

dekiel commented 1 year ago

We got 3 frontrunners for this new feature:

  • Tunas (create PRs)
  • Busola (auto-approve PRs)
  • Jellyfish (create and approve PRs for auto-bumping)

On a meeting with Tunas we got following feedback from @k15r .

kyma-bot commented 1 year ago

This issue or PR has been automatically marked as stale due to the lack of recent activity. Thank you for your contributions.

This bot triages issues and PRs according to the following rules:

You can:

If you think that I work incorrectly, kindly raise an issue with the problem.

/lifecycle stale

tobiscr commented 1 year ago

By moving towards the service provider structure, we should challenge how this feature can bei implemented and will be consumable by our teams.

dekiel commented 11 months ago

Made a tool used for our team a production ready.

kyma-bot commented 9 months ago

This issue or PR has been automatically marked as stale due to the lack of recent activity. Thank you for your contributions.

This bot triages issues and PRs according to the following rules:

You can:

If you think that I work incorrectly, kindly raise an issue with the problem.

/lifecycle stale

github-actions[bot] commented 7 months ago

This issue has been automatically marked as stale due to the lack of recent activity. It will soon be closed if no further activity occurs. Thank you for your contributions.

Sawthis commented 7 months ago

@TorstenD-SAP in my opinion we should aim for something based on the Github features in the future like e.g https://github.com/marketplace/actions/auto-approve maintained by the teams. And https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/incorporating-changes-from-a-pull-request/automatically-merging-a-pull-request could be a replacement for Prow tide component.

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale due to the lack of recent activity. It will soon be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 3 weeks ago

This issue has been automatically closed due to the lack of recent activity. /lifecycle rotten