fluxcd / flux

Successor: https://github.com/fluxcd/flux2
https://fluxcd.io
Apache License 2.0

Error notifications despite the resource being successfully reconciled #3480

Closed Diaoul closed 3 years ago

Diaoul commented 3 years ago

Describe the bug

Flux sends error-level notifications even though the resource was reconciled successfully. These are the Discord notifications I received, in this order:

helmrelease/jellyfin.media
Helm upgrade has started
revision
7.3.2
helmrelease/jellyfin.media
Helm upgrade succeeded
revision
7.3.2
helmrelease/jellyfin.media
reconciliation failed: Operation cannot be fulfilled on helmreleases.helm.toolkit.fluxcd.io "jellyfin": the object has been modified; please apply your changes to the latest version and try again
revision
7.3.2

And when I checked later:

$ flux get helmrelease -n media jellyfin
NAME        READY   MESSAGE                             REVISION    SUSPENDED
jellyfin    True    Release reconciliation succeeded    7.3.2       False 

All of this happened within a two-minute window between the start of the reconciliation and the error notification.

To Reproduce

Hard to tell. There was no manual intervention besides updating the Docker image in the chart values in the GitOps repository; all of these resources are managed by Flux. The last time jellyfin was reconciled, it worked fine. A week ago a grafana reconciliation hit the same error, but not afterwards, so it does not seem to be tied to a particular Helm chart. My guess is that there is a conflict because Flux tries to run two reconciliations of the same resource at the same time.

Expected behavior

Error notifications should be sent only when reconciliation actually fails, perhaps only after the failure has persisted for a longer period of time. At the very least, this could be a warning on first occurrence. I am not sure what should be done, but raising an error seems wrong.

Logs

N/A

kingdonb commented 3 years ago

This repo is for Flux v1, which is in maintenance mode.

I think you are looking for https://github.com/fluxcd/notification-controller or perhaps Flux v2 discussions, Q&A: https://github.com/fluxcd/flux2/discussions/categories/q-a

I'm not sure there is enough here to open an issue (the instructions to reproduce are not especially clear), but you may get some clarity from one of the maintainers by asking about this in the Q&A section. I think you have understood the issue, and it looks like what you guessed: this notification means that a reconciliation was triggered while one was already in progress, which is perhaps only an important error if it happens more than a few times in a row.

I suspect this is an error-class notification that arrived among info-severity notifications. The info notifications can be pretty noisy. You can tune them out by selecting the error severity, but in this case that wouldn't have helped, since it was an error that contributed to the noise floor here. It would be even worse without the info context showing that everything really did come out alright.

https://fluxcd.io/docs/components/notification/alert/

You can configure the alert with an ExclusionList; if this message is never important, you can always exclude it. Maybe that's enough to resolve this for you? I'm not sure what else to suggest, since Flux alerts are based only on the parameters that you configure. Hope this information is helpful!
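For illustration, an Alert with an exclusion list might look roughly like the sketch below. This is only a sketch under assumptions: the alert name `helmreleases`, the provider name `discord`, and the `flux-system` namespace are hypothetical placeholders, and the `apiVersion` may differ depending on the Flux version installed. Entries in `exclusionList` are treated as regular expressions matched against the event message, so a distinctive fragment of the conflict message should be enough.

```yaml
# Sketch only: names and namespaces below are assumed, not taken from the issue.
apiVersion: notification.toolkit.fluxcd.io/v1beta1
kind: Alert
metadata:
  name: helmreleases        # hypothetical alert name
  namespace: flux-system    # assumed namespace of the notification-controller
spec:
  providerRef:
    name: discord           # hypothetical Provider pointing at the Discord webhook
  eventSeverity: error      # only forward error events
  eventSources:
    - kind: HelmRelease
      name: '*'             # all HelmReleases in the namespace below
      namespace: media
  exclusionList:
    # Drop the transient update-conflict message quoted in the report
    - "the object has been modified"
```

Note that this suppresses the message every time it occurs, including the case where the conflict repeats many times in a row, so it trades noise for visibility.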

Diaoul commented 3 years ago

Thanks for your detailed answer. I know I can add a custom rule to keep this error from showing up at all, but it seems wrong that this is reported as an error when everything went well in the end. I thought I'd just let the maintainers know about this behavior, and if no fix is possible I will ignore it. I've opened the issue on the correct repo, sorry for the noise!