Open pacoxu opened 2 months ago
In this a scope-limited form, this should not be hard to write. Get a GH webhook call about a merge, inspect the PR, send a notification (we already have Slack client code in Prow).
Notification systems love to grow in scope, though :thinking: Some people will almost certainly want notifications about other events. Some people will almost certainly want notifications to different destinations than Slack. Would we need per-org / per-repo configuration surface? Like disabling the notifications for certain repos, or routing the notifications to different slacks based on repos?
Given the current state of Prow maintenance, I think we should prefer building a very simple, single-purpose thing rather than trying to come up with a new generic notification system.
One alternative could be something that simply publishes interesting metrics through Prometheus and then we could set up standard alertmanager-backed alerts for events like this.
One alternative could be something that simply publishes interesting metrics through Prometheus and then we could set up standard alertmanager-backed alerts for events like this.
This seems to be a better solution to me, as the cncf-cla: no
is not a general case and only occurs several time a year.
If we want to go this direction, is there an example to add an alert policy?
@dims @BenTheElder which solution do you prefer?
@pacoxu see example in https://kubernetes.slack.com/archives/CJH2GBF7Y/p1724136124406689
An alert like that looks like a great thing to have
We shouldn't rely on the label, the status is the source of truth.
Statuses can change after PRs merge, so the alert is only a best effort warning in any case.
If we want to be 100% sure, the CNCF CLA tooling could run an audit mode and scan all commits in the repo against their CLA database (which would surely be cheaper than us attempting to scan the github API for commit statuses).
For a weak check:
Given the current state of Prow maintenance, I think we should prefer building a very simple, single-purpose thing rather than trying to come up with a new generic notification system.
Yes, I think we could reuse some tooling from the slack alerts but I think it should be a single purpose plugin if we implement this, to keep things as simple as possible.
I think the problem with using a metric is then we still need to track down where the bad merge happened, and now we have large cardinality on the metric if we're recording it there. Versus if we just push out a slack alert (and/or completely offload this to CNCF auditing).
(Poking some CNCF folks about the subject ...)
link to https://github.com/kubernetes/community/issues/7800
EDITED: as unrelated
I disagree strongly with the linked issue and I think that's an unrelated topic for now.
https://kubernetes.slack.com/archives/C01672LSZL0/p1726321799185879
A prow plugin that can support sending alert to slack when a PR with
cncf-cla: no
label was merged.