Open cwrau opened 2 months ago
I've tried debugging this, and this line is the root cause.
Because the digest seems to change during the uninstall, which in fact returns an error, that error is not returned, helm-controller thinks it finished without an error and removes the finalizer, therefore removing the HelmRelease.
Is there a reason for the digest check? I would just open a PR removing that, if that's ok?
Because the digest seems to change during the uninstall
Why would it change? Are you using GitRepo as the source of the HelmRelease?
Because the digest seems to change during the uninstall
Why would it change?
I don't know, I debugged helm-controller
and the .Digest
of req.Object.Status.History.Latest()
between https://github.com/fluxcd/helm-controller/blob/7f78cdc36836f0ff3b652e518550932c9e890560/internal/reconcile/uninstall.go#L83 and https://github.com/fluxcd/helm-controller/blob/7f78cdc36836f0ff3b652e518550932c9e890560/internal/reconcile/uninstall.go#L141 is different.
I tried looking through the code how the req.Object
could get changed, but I couldn't figure it out😅
Are you using GitRepo as the source of the HelmRelease?
Nope, using HelmRepository, no update occured during these 60 lines 😅
@stefanprodan, ah, I think I've found it;
During helm uninstall, after changing info.Status
, helm updates the release
After the update, flux notifies itself on the HelmRelease object about the history "change"
During the observation, ObservedToSnapshot
calculates a new Digest with the updated info.Status
, which of course changes the digest
I hope I didn't miss anything 😅
Is there any news on this? Maybe my PR, #1024, is even acceptable?
Would be great to have this fixed 😊
Ola, is there something I can do to help get this resolved? Do you need more information?
@cwrau we've talked about removing the check in a dev meeting and the conclusion was that it's safe to do so. For the PR to be merged you'll need to adapt the unit tests.
Describe the bug
When you delete a HelmRelease, even with
uninstall.deletionPropagation=foreground
, flux, via helm or otherwise, marks all created resources to be deleted and then removes the finalizer, resulting in the HelmRelease disappearing even though the uninstallation hasn't finished.This, at least sometimes, leaves the "real" helm release to be stuck in
uninstalling
which has to be manually resolved.Steps to reproduce
Create a helm chart like https://github.com/teutonet/teutonet-helm-charts/tree/main/charts/t8s-cluster, which creates CRs that are not just "no-ops" for deletion, like cluster-api resources.
Install it, wait for everything to be ready, uninstall.
Expected behavior
That flux leaves the finalizer on the HelmRelease until the "real" helm release is really uninstalled.
I suppose what I mean in a non-flux way would be for flux to wait until
helm uninstall
finishes successfully.Screenshots and recordings
No response
OS / Distro
N/A
Flux version
N/A
Flux check
► checking prerequisites ✗ Kubernetes version v1.27.14 does not match >=1.28.0-0 ► checking version in cluster ✔ distribution: flux-2.3.0 ✔ bootstrapped: true ► checking controllers ✔ helm-controller: deployment ready ► ghcr.io/fluxcd/helm-controller:v1.0.1 ✔ image-automation-controller: deployment ready ► ghcr.io/fluxcd/image-automation-controller:v0.38.0 ✔ image-reflector-controller: deployment ready ► ghcr.io/fluxcd/image-reflector-controller:v0.32.0 ✔ kustomize-controller: deployment ready ► ghcr.io/fluxcd/kustomize-controller:v1.3.0 ✔ notification-controller: deployment ready ► ghcr.io/fluxcd/notification-controller:v1.3.0 ✔ source-controller: deployment ready ► ghcr.io/fluxcd/source-controller:v1.3.0 ► checking crds ✔ alerts.notification.toolkit.fluxcd.io/v1beta3 ✔ buckets.source.toolkit.fluxcd.io/v1beta2 ✔ gitrepositories.source.toolkit.fluxcd.io/v1 ✔ helmcharts.source.toolkit.fluxcd.io/v1 ✔ helmreleases.helm.toolkit.fluxcd.io/v2 ✔ helmrepositories.source.toolkit.fluxcd.io/v1 ✔ imagepolicies.image.toolkit.fluxcd.io/v1beta2 ✔ imagerepositories.image.toolkit.fluxcd.io/v1beta2 ✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta2 ✔ kustomizations.kustomize.toolkit.fluxcd.io/v1 ✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2 ✔ providers.notification.toolkit.fluxcd.io/v1beta3 ✔ receivers.notification.toolkit.fluxcd.io/v1 ✗ check failed
Git provider
No response
Container Registry provider
No response
Additional context
No response
Code of Conduct
// "moved" from https://github.com/fluxcd/flux2/issues/4872