grafana / grafana-operator

An operator for Grafana that installs and manages Grafana instances, Dashboards and Datasources through Kubernetes/OpenShift CRs
https://grafana.github.io/grafana-operator/
Apache License 2.0
863 stars 384 forks source link

feat: add support for notification policies #1610

Closed theSuess closed 1 month ago

theSuess commented 1 month ago

This PR adds support for Notification Policies.

I've recreated the Route struct to support Kubernetes based validation.

One concern is that only one Notification Policy can be applied to an instance. If a user creates two notification policies which match the same instance, they will flip between each other. I have not yet come up with a clever (and performant) safeguard against this.

NissesSenap commented 1 month ago

Sitting in the car reading the PR so not too easy to do research, so just thinking out loud.

I guess the natural solution is to add validation webhook support to the operator, which is a pain.

Could another option be to do something with status on the grafana instance? When the reconciler adds the initial notification policy, write a status. And if a new policy gets added, error out and write a status on the new policy? I guess similar logic would be needed in the validation webhook, the big difference is that it wouldn't be triggered for each reconciliation.

theSuess commented 1 month ago

That's a good idea! Maybe not in the status field (as IMHO this should be owned by the grafana reconciler) but an annotation like operator.grafana.com/applied-notification-policy: namespace/notification-policy-name/uid could help here

NissesSenap commented 1 month ago

Could an idea be to use watch https://book.kubebuilder.io/reference/watching-resources/externally-managed to listen for the grafana instances.

So if there are 2 policies for the same instance, and the policy that was created first is deleted. The annotation on grafana will be deleted as part of the delete process. There will be no way for the second policy to be added (except a retry of the reconciler), but if we have a watcher we should be able to check if the annotation has changed and then start to reconcile the "new" policy.

theSuess commented 1 month ago

The annotation based locking approach is now implemented. I tried my hand at event watchers but couldn't figure out a good way to trigger reconciliation once the Grafana instance is freed (happy to take input here).

Other than that, this PR is ready for code review from my side

michelle-douglas commented 1 month ago

Hi Team, Can I get an update on this? I have a customer (Alkami Technology) who is waiting on this and has offered to test it with us if needed. Please advise. Thanks! Michelle