fluxcd / helm-controller

The GitOps Toolkit Helm reconciler, for declarative Helming
https://fluxcd.io
Apache License 2.0
404 stars 161 forks source link

Missing some crucial events #889

Closed artemlive closed 6 months ago

artemlive commented 7 months ago

Hello! I'm curious why there are no events like: https://github.com/fluxcd/helm-controller/blob/56e36da8c15fe4aef06d257c3295044e53874cab/internal/controller/helmrelease_controller.go#L410 https://github.com/fluxcd/helm-controller/blob/56e36da8c15fe4aef06d257c3295044e53874cab/internal/controller/helmrelease_controller.go#L416 in the new version of the helm controller. Today, we were surprised when we upgraded, and there weren't any of these events.

We built a notification system based on these events that is critical for us and our internal processes. It's really convenient to have granular events to know when the install or upgrade starts. Don't you mind if we contribute something like this to the new version of the controller? Thanks!

darkowlzz commented 6 months ago

Hi, I can try explaining the reasoning behind this as I understand. The previous version was written when we didn't had much conformity amongst the controllers and well defined standards of when to send notifications. Over the last few years, as we refactored the flux controllers, we developed new designs for our controllers that made all the controllers work in a similar manner making them more uniform. One of the common patterns in the new design is to send notifications only when necessary to improve the user experience and prevent flood of notifications. As a result of this, we now send notification about the result of reconciliations, failed or succeeded or recovered from a previous failure. We don't send notifications for reconciliations that don't result in any effective change. The individual actions performed as part of the reconciliation is an implementation detail. In order to improve observability and provide details about the current state of HelmRelease, we adopted kstatus and reflect the current state in the status conditions. These are well defined, machine readable APIs that kubernetes applications can read and operate on. All the flux controllers expose such operational details through the status conditions. These are also used in the new flux monitoring setup, see https://fluxcd.io/flux/monitoring/custom-metrics/ for details about how we use kube-state-metrics to export flux resource metrics which can then be read and used in other observability systems to monitor and set up alerts or other automation on any custom state of the resource. Similarly, any other system should be able to consume flux resource state and build on top of it. You may not get the exact event state like before but it should be possible to interpolate and obtain the necessary information.

We can discuss more about it and try to provide the exact information needed from the status conditions based on the use case. Since helm-controller v0.37.x is the first version with the new design, there must be a lot of space for improvements.

artemlive commented 6 months ago

Hello @darkowlzz! First of all, thanks for such a detailed explanation! We've decided to use the kstatus approach with conditions. The manifest already has all the information we need, so we'll build a solution on top of it.

Thanks for the clarification!