Tagging a few folks based on the PRs to the HelmChart reconciliation part of the code: @hiddeco @seaneagan.
The helm-controller does not support this at present, nor did the Helm Operator.
I have been playing around with proofs of concept to determine changes based on the manifest blob that is stored in the Helm storage, using 2- and 3-way merges, but ran into some issues around mutating cluster webhooks that may modify the object when it is applied to the cluster by Helm.
That being said (and done), I do want to eventually support this, but do not have a time frame on when.
@hiddeco any updates? I just realised that a deployment was deleted but Helm was perfectly happy. It would be super cool if Flux could detect that the state was undesired :)
We are looking forward to having this feature!! Any updates?
Facing the same issue. Reconciliation does not work with resources deployed by Helm.
I'm also looking forward to having this feature! What's the current status?
You could try a CronJob with manual reconciliation: `flux reconcile hr {{your hr}} --with-source -n {{your ns}}`
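As a rough sketch only (the image tag, schedule, namespace, and service account below are assumptions, and the service account would need RBAC to patch the Flux objects), such a CronJob could look like:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: force-helmrelease-reconcile
  namespace: flux-system                 # assumed namespace
spec:
  schedule: "0 * * * *"                  # once an hour; pick your own cadence
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: flux-reconciler           # assumed SA with RBAC on HelmRelease/HelmChart
          restartPolicy: OnFailure
          containers:
            - name: flux
              image: ghcr.io/fluxcd/flux-cli:v2.1.0     # assumed image/tag
              command: ["flux"]
              args:
                - reconcile
                - helmrelease
                - my-release                            # placeholder for {{your hr}}
                - --with-source
                - -n
                - my-namespace                          # placeholder for {{your ns}}
```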
The only way currently to restore the helmrelease to the defined state is to reinstall or upgrade it, or am I missing something?
I deleted a CRD (and all its CRs) and recreated it, however the helm-controller does not recreate the objects from the Helm release. Now my cluster is in an inconsistent state and it does not recover. Even if I call `flux reconcile helmrelease`, it does not recreate the deleted resources from the helm template.
After consultation with the Helm maintainers, the blueprint for this is in: https://github.com/bacongobbler/helm-patchdiff/blob/8cbbc9b4716330fd4e69b5fea93320fa5da1c352/main.go#L289-L294. This could be implemented after the planned work in https://github.com/fluxcd/helm-controller/milestone/1 has landed (and our testing suite has been strengthened).
`flux suspend ...`, then `flux resume ...` will do the trick to recreate a resource that was deleted manually.
We are looking forward to having this feature. Any updates on when it could be available?
This should be in the docs: the moment you start using HelmRelease, you lose all the GitOps flow.
Any news about this issue? @hiddeco, is there any plan for supporting reconciliation in case of changes manually applied by a user?
This topic comes up a lot, I wanted to reiterate these thoughts I placed on #267 originally, but while they don't exactly belong on that discussion, they can perhaps go here:
Helm uses a 3-way merge to incorporate: (1) the old manifest, (2) its live state on the cluster, and (3) the new manifest.
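As a purely illustrative sketch (a single made-up field, not taken from this thread) of why manual drift can survive that merge:

```yaml
# (1) old manifest, from the previous release:
spec:
  replicas: 3
---
# (2) live state, after someone manually scaled the Deployment:
spec:
  replicas: 1
---
# (3) new manifest, rendered for the upgrade:
spec:
  replicas: 3
# old and new manifests agree, so the computed patch omits `replicas`,
# and the manually drifted value of 1 survives the upgrade.
```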
Helm also provides various opportunities for the installing agent to manipulate the state before, around, and after install, both within the chart (lifecycle hooks), which are permitted to be non-deterministic (and so may have outputs that are not declaratively formed), and outside of the chart, with Post Render behavior that, outside of Flux, would also be capable of non-deterministic behavior.
Simplified: the installer can create secrets on the first run (`helm install`) that won't be generated again during a second run (`helm upgrade`).
Though Flux makes the postrender behavior declarative, it's still running the Helm code under the hood, which does not assume declarativeness, and that means we can't assume anything that Helm itself would not have assumed or guaranteed.
Every upgrade produces a new release artifact, in the form of a secret of the type `helm.sh/release.v1`. So when you have just installed for the first time and you have a handful of Helm charts, your cluster-wide secrets may look like this:
NAMESPACE NAME TYPE DATA AGE
chartmuseum sh.helm.release.v1.chartmuseum-chartmuseum.v1 helm.sh/release.v1 1 4d13h
deis sh.helm.release.v1.deis-hephy.v1 helm.sh/release.v1 1 4d13h
harbor sh.helm.release.v1.harbor.v1 helm.sh/release.v1 1 4d13h
ingress-nginx sh.helm.release.v1.ingress-internal.v1 helm.sh/release.v1 1 4d13h
ingress-nginx sh.helm.release.v1.ingress-public.v1 helm.sh/release.v1 1 4d13h
kube-oidc-proxy sh.helm.release.v1.kube-oidc-proxy.v1 helm.sh/release.v1 1 4d11h
minio-lusers sh.helm.release.v1.minio-lusers-minio.v1 helm.sh/release.v1 1 4d13h
minio-stage sh.helm.release.v1.minio-stage-minio.v1 helm.sh/release.v1 1 4d13h
monitoring sh.helm.release.v1.kube-prometheus-stack.v1 helm.sh/release.v1 1 2d22h
monitoring sh.helm.release.v1.kube-prometheus-stack.v2 helm.sh/release.v1 1 2d22h
openvpn sh.helm.release.v1.openvpn.v1 helm.sh/release.v1 1 4d13h
Three-way diffs are complicated, even more complicated than three-way merges. Now imagine instead of comparing the inputs to the outputs, Helm controller just runs an upgrade each time, producing a new revision and a new versioned secret.
The two secrets shown here for `v1` and `v2` revisions in the `monitoring` namespace will quickly fall off the edge, as new release secrets are generated upon each upgrading reconcile, and we've effectively made Helm's `rollback` feature useless. Helm controller can only infer from the changing state in any inputs that it should perform an upgrade.
I'm not sure if philosophically this can mesh with the "no imperative magic" philosophy of GitOps, which expects all changes to be declarative. But if there were a safe place to put such an annotation in the spec, guaranteed not to have any other side effect on anything downstream other than forcing a single reconciliation that would definitely result in an upgrade as long as it succeeds, it would certainly make sense for the CLI to include a way to trigger that behavior. Part of the problem is that you can't guarantee nobody is watching.
We usually recommend people who are struggling with this issue to find the `remediation` options in the spec for `install` and `upgrade`, and try fine-tuning them. The default behavior is to roll back when an upgrade fails, but it is not very tolerant of failures on `install`, as it cannot roll back to a previous release in the event of an install failure; users may want to adjust the timeout upwards or downwards so that failures are less touchy (or don't wait so long, the default is `5m`), or to retry a fixed number of times even when install fails. Those behaviors are not the defaults, to avoid an unexpected interaction where retrying again (or an infinite number of times until success) causes havoc.
https://fluxcd.io/docs/components/helm/helmreleases/#configuring-failure-remediation
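For illustration only, a minimal sketch of what tuning those options can look like on a HelmRelease (the names and numbers below are placeholders, not recommendations):

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: my-release
  namespace: my-namespace
spec:
  interval: 10m
  timeout: 5m                    # how long to wait before a release attempt is marked failed
  chart:
    spec:
      chart: my-chart
      sourceRef:
        kind: HelmRepository
        name: my-repo
  install:
    remediation:
      retries: 3                 # retry a failed install a fixed number of times
  upgrade:
    remediation:
      retries: 3
      remediateLastFailure: true # roll back if the final retry also fails
```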
The least wrong way to approach this now is to include some values from a ConfigMap that will not be read by the chart:
https://fluxcd.io/docs/guides/helmreleases/#refer-to-values-in-configmaps-generated-with-kustomize
and update the ConfigMap with a new nonce when you want to force the upgrade.
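A hedged sketch of that approach, following the linked guide (all names here are placeholders; the guide also adds a kustomizeconfig.yaml, not shown, so Kustomize rewrites the name reference in the HelmRelease):

```yaml
# kustomization.yaml
configMapGenerator:
  - name: my-release-values
    files:
      - values.yaml               # contains e.g. a `forceUpgradeNonce` value the chart never reads;
                                  # bump it and commit to force an upgrade
---
# HelmRelease excerpt
spec:
  valuesFrom:
    - kind: ConfigMap
      name: my-release-values     # Kustomize appends a content-hash suffix here
```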
Some users have reported they can trigger an upgrade from the failed state when running `flux suspend helmrelease <hr>` and `flux resume helmrelease <hr>`, but this is not reliable in my experience, depending too much on the type of failed state that your helmrelease landed in.
Understanding that none of this directly addresses the original topic ("why doesn't Helm Controller resolve drift in my cluster"), it seems that many people who are asking for that behavior really are willing to use a "closed loop" and don't expect drift to come into their cluster from random places; they are just looking for infinite retries and failure remediation, and those are all options which can be configured ... though it is not the default behavior to retry infinitely, and for some very good reasons.
> philosophically

My philosophical answer so far: rewrite all the charts (using `helm template` + manual changes) into kustomizations.
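For anyone considering that route, a minimal sketch of what the Flux side can look like (the path and source name are assumptions; the manifests under ./rendered would be the `helm template` output plus manual edits, committed to Git):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app-rendered
  namespace: flux-system
spec:
  interval: 10m
  prune: true                # delete objects removed from the rendered manifests
  sourceRef:
    kind: GitRepository
    name: my-repo
  path: ./rendered           # directory of plain manifests produced by `helm template`
```

The trade-off is losing Helm's release machinery, but kustomize-controller re-applies plain manifests on every reconciliation and corrects drift, which is the behavior being asked for here.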
If fluxcd works like this, then how does Flagger handle blue/green deployments?
Flagger uses Git to facilitate rollback, rather than Helm.
https://github.com/fluxcd/flagger/blob/main/docs/gitbook/tutorials/canary-helm-gitops.md
Although you can use Flagger and Helm together, the canary in this example is inside of the Helm chart, not around or outside of the Helm Release. So when the canary fails, the traffic continues routing to the old primary which is not affected or upgraded (it's the prior state of the Helm release's deployments, associated refs from the canary spec, HPAs, etc.).
You move on to the next state the same as with any Flagger canary, by either git-reverting and allowing a new canary to run again with the old release, or perhaps more likely and more beneficial, by letting the Canary keep the stable release going, and pushing another release after this with the fix to whatever issue you observed in the canary.
I don't think you would want to configure automatic rollback or other failure remediations together with Flagger. Actually, I'm pretty sure you'd want to disable remediation in that case.
This is on my to-do list as part of an overhaul of the general logic of the reconciler. Given the changes that at times occur to my day-to-day priorities, I can not provide an ETA at this time other than "next (next) minor release with more bigger changes".
If k8s Kustomize supported a new option in configMapGenerator to instruct `kustomize build` to set a new `.metadata.name` for the generated ConfigMap object each time it is invoked, even when the ConfigMap's `.data` remains the same (e.g. adding a timestamp suffix to the ConfigMap object name), that would really help here: you would get a HelmRelease reconciliation each time the Flux2 Kustomization manifest wrapping the HelmRelease object is reconciled (as set by its `.spec.interval`).
This would require low effort from Flux2, i.e. compiling it against the new kustomize pkgs (provided by the k8s folks) supporting that new configMapGenerator option.
Could this be a possible solution to try here?
If it's a new feature from Kustomize upstream, I'd imagine it will be supported by Flux if possible, but if it acts as you describe I think it's not going to have the intended effect; instead it would be something more like a new deployment revision or a new instance of the job every single `spec.interval`, and that would be less than ideal.
I guess that you wanted this in order to achieve drift correction with Helm at the cadence of Kustomize controller's reconciliation, since that is the topic here. Hypothetically it would be possible; with some of the memory and performance advancements, I think Helm controller today is a lot better equipped for this than it was at any time in the last 3-12 months, thanks to caching and other advances. I'm not sure this proposed approach is the way to go about it, though.
I'm not sure the result would be good for Helm secret storage, as the retention will push any meaningful changes off the history in short order with an endless stream of near-noop releases, one being promoted every interval. The design of the controller is intentional about not running up the release counter when nothing really has changed, so in the event it's needed, Helm rollbacks from Helm secret will still be meaningful and useful more than 10 minutes after the fact of a release.
Since there are changes, and revisions are immutable, the new ConfigMap in the definition of each container will likely also trigger a rollout of the deployment or a re-trigger of the job, if any. For various reasons I think this solution as you propose it is not ideal; a better solution would ideally find a way to achieve drift detection without this side effect of spamming the release history, and potentially also with better awareness of the Helm lifecycle hooks (maybe even user-facing control over whether re-executions should re-trigger the hooks or not; I don't know how granularly this is possible, or whether it is at all).
I'm not sure how much of that's planned now, but it is in the milestone for Flux 2.0 GA so I think broadly we can count on this hopefully being resolved in a way that is a bit more ergonomic and no-surprises than adding a purely-random-suffix feature from Kustomize upstream.
Yes, I am looking for an automated way (i.e. without the need for any manual "git commit" to be applied) to make the `.spec` of a Flux2 HelmRelease object "change", so that the helm-controller has to reconcile that HelmRelease object and so handle the possible drift correction (e.g. someone ran a `kubectl delete` command on an API object "owned" by the related Helm release).
If I run a `helm upgrade` command on an existing Helm release (in a given namespace), passing (1) the same Helm chart currently applied in that Helm release with (2) the exact deployment-specific Helm param settings (i.e. values) I used in my previous `helm upgrade --install` command, then yes, I get a new revision of the Helm release (reflected in a new Helm-related k8s Secret object) BUT no patch is applied by Helm's k8s API client on the k8s API objects owned by the Helm release (as the rendered YAMLs are the same) ... except if someone manually modified those Helm-release-owned API objects (with those `kubectl` commands).
Note the ConfigMap I propose to get a new name each time the Flux2 Kustomization is reconciled (as instructed by its `.spec.interval`) is the one referred to from the k8s HelmRelease manifest wrapped by that Flux2 Kustomization manifest, so no Deployment/StatefulSet/DaemonSet pods in the scope of the Helm release will be re-created if the content of that ConfigMap is the same as it was in the previous Flux2 Kustomization reconciliation (as the Helm param settings to apply for Helm rendering are the same).
Only the pre/post-upgrade Jobs will be triggered ... and I would expect them to be "prepared" to not do or change anything when not really needed (if they do, it means a poor pre/post-upgrade Helm hooks design, as that Helm chart is not prepared to run a `helm upgrade` on the same chart and values currently deployed). But I understand we should not preclude potential new features because of the poor design of some Helm charts ...
But, anyway, the Flux2 helm-controller could still cope with this: when it realizes the helm upgrade is to be run on the same Helm chart currently deployed with the exact same Helm param settings previously applied, it could automatically run the upgrade without invoking the pre/post hooks (the `--no-hooks` option supported in the `helm install/upgrade/rollback/uninstall` commands, which I presume is also available in the Helm SDK; I did not check that ...).
P.S: Just to clarify: I do not know about any intention from k8s Kustomize guys to add something like this in the configMapGenerator (it is nothing more than a "possible request" to them).
Actually I can simulate all this by adding two files to the k8s kustomization.yaml's configMapGenerator:
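A hedged reconstruction of that setup (only discard.yaml is named in this thread; the other file name is an assumption):

```yaml
# kustomization.yaml
configMapGenerator:
  - name: my-release-values
    files:
      - values.yaml    # the real Helm values passed to the chart
      - discard.yaml   # dummy file; committing any change here renames the generated ConfigMap
```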
As the HelmRelease `.spec.valuesFrom` points to a different ConfigMap each time I change (and git commit) the content of "discard.yaml" (once the flux2 kustomization reconciliation runs), a new reconciliation is run by the helm-controller on that updated HelmRelease object, and so kubectl changes applied on the API objects "owned" by that Helm release are "repaired", even when keeping untouched the referenced Helm chart version (the HelmRelease object's `.spec.chart.spec` stanza) and keeping untouched the content of the values.yaml file to apply.
If I instead change any other file in the git repo that is not referenced from the k8s kustomization.yaml placed in the git repo path pointed to by the Flux2 Kustomization manifest, the HelmRelease reconciliation does not run (as expected, since the HelmRelease `.spec` did not change), so the potential drift is not "repaired".
The k8s Kustomize configMapGenerator feature mentioned above would only be a "trick" to get the "desired" change in the HelmRelease object's `.spec` stanza without having to apply any manual step (i.e. `git commit` ...).
@kingdonb you wrote above:
> The design of the controller is intentional about not running up the release counter when nothing really has changed, so in the event it's needed, Helm rollbacks from Helm secret will still be meaningful and useful more than 10 minutes after the fact of a release.
I do not see how this proposed behavior could spoil the current installRemediation / upgradeRemediation behavior .... (???)
Additionally, as part of the tests I am running, I just modified the `.spec.interval` of a HelmRelease object and it triggered a reconciliation from the helm-controller (I understand triggered by the mismatch between the HelmRelease object's `.metadata.generation` and the `.status.observedGeneration`), which makes sense to me: a `helm upgrade` was applied using the exact same Helm chart and values as currently deployed (as I did not change anything else).
But then all the arguments against running a HelmRelease reconciliation when chart and values are kept the same could apply here too ... yet I think we all agree the reconciliation applied by the helm-controller because of whatever change in the `.spec` stanza is right in this case. Why then would it be wrong to do the same periodically (if so set/requested in an enhanced HelmRelease object) to correct possible drifts and so get a perfect GitOps handling from Flux2 also at the Helm level?
> the current installRemediation / upgradeRemediation behavior
This is not the only thing we have to worry about, nor the only functional use of the helm history. Strict compatibility with the Helm client is a highly important goal for Helm Controller and we have been able to preserve that strong compatibility until now. We have this guarantee for users that they can come and go from Helm Controller as they please, and no other GitOps tool offers this today; the Helm history is maintained exactly as if you created a release using the `helm` CLI, and upgrades from Helm Controller look exactly like upgrades from a Helm CLI user; they are functionally indistinguishable.
Users can execute a manual rollback after the release is marked `Succeeded` (this differs from the built-in remediation, which only executes when a release is `Failed`) and they will get back to the state they were in before the upgrade. That's what I'm suggesting would break if we simply ran the helm upgrade logic in a loop every `spec.interval`: the `helm history` keeps the result of each release in a secret of type `helm.sh/release.v1`, and you get a new revision each time an upgrade is executed, whether it changes anything or not.
NAME TYPE DATA AGE
sh.helm.release.v1.my-release.v1 helm.sh/release.v1 1 15d
sh.helm.release.v1.my-release.v2 helm.sh/release.v1 1 15d
sh.helm.release.v1.my-release.v3 helm.sh/release.v1 1 15d
sh.helm.release.v1.podinfo.v1 helm.sh/release.v1 1 15d
sh.helm.release.v1.podinfo.v2 helm.sh/release.v1 1 15d
sh.helm.release.v1.podinfo.v3 helm.sh/release.v1 1 15d
Drift detection should not break the function of `helm history` or a manual `helm rollback`, which allows users to manage releases manually after suspending the `HelmRelease`, so their changes are not overwritten. Sure, users can do rollbacks in the GitOps way, but we have historically always protected the capability of reverting to the old way of managing HelmReleases during an incident or when there is some other need.
If Helm Controller is going to perform drift detection and preserve the state of each release according to GitOps, it must not break this invariant in the process. That's all I'm imputing here, I am not the expert on this and I'm simply commenting as a passenger, not as a decision maker. I think the decision has already been made to implement this.
I do not know the details of those decisions. However, I know what will go wrong if we simply run upgrade on every interval; please take a look at `--history-max` in the Helm manual (or `MaxHistory` in the Flux API docs). Helm will only maintain the most recent 10 releases in history by default; history older than 10 releases will not be saved.
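For reference, a minimal sketch of where that cap is expressed in the HelmRelease spec (names are placeholders):

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: my-release
spec:
  interval: 10m
  maxHistory: 10        # mirrors Helm's --history-max; older release secrets are pruned
  chart:
    spec:
      chart: my-chart
      sourceRef:
        kind: HelmRepository
        name: my-repo
```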
We had a bit of discussion about this at the most recent Flux Dev Meeting in public, it was recorded, and should appear on https://www.youtube.com/channel/UCoZxt-YMhGHb20ZkvcCc5KA soon (but I see we are a bit behind our backlog for uploading these recordings.) The short version is, the feature is planned to be implemented, and there are these hurdles.
Thanks @kingdonb for the time you are putting into this (really appreciated ;-)
Yes, I know about the per-Helm-release-revision k8s Secret objects and the `--history-max` limits; the things you mention above are the drawbacks/cons of my proposal, and they are the "price to pay" (at least in my mind) in order to ensure a potential drift (due to some `kubectl delete/patch`-like command applied on API objects owned by the involved Helm release) is resolved in the next `HelmRelease` reconciliation.
But I am more than happy if a coming flux2 (helm-controller) release solves it in a different way that does not add those drawbacks ;-)
Thanks again for sharing your view and future flux2/helm-controller plans!
I am also looking for this feature to be implemented, as automatic reconciliation was the main feature due to which we started using Flux for our K8s deployments. I came to know about this recently when someone manually scaled down the replica count of a deployment to 0 and Flux didn't reconcile it back to the previous state until we suspended/resumed the HelmRelease (it would be good if this could be clearly called out in the Flux docs). @kingdonb @hiddeco Thanks for sharing the plan of implementing this feature in Flux 2.0. I can see that suspending/resuming a HelmRelease does a reconciliation for the corresponding deployment & reverts any manual changes done. However it also has all the drawbacks which you mentioned previously for the configMap, like losing HelmRelease versions / triggering lifecycle hooks etc.
However, if I am OK with those drawbacks: do you see any other challenges/unforeseen issues in suspending the HelmReleases & resuming them periodically using a script? Since this is already getting implemented in Flux 2.0, once the feature is out, I can just discard this script & start using the Flux native feature. Since we have a lot of deployments which are managed via Flux HelmRelease, manually adding/reverting the configMap option shared above might be difficult (or making sure that everyone remembers to add it).
> I can see that suspending/resuming a HelmRelease does a reconciliation for the corresponding deployment & reverts any manual changes done. However it also has all the drawbacks which you mentioned previously for the configMap, like losing HelmRelease versions / triggering lifecycle hooks etc.
That sounds like a solid understanding of the issues at play 👍
> Do you see any other challenges/unforeseen issues in suspending the HelmReleases & resuming them periodically using a script?
Generally the main issue is that someone is making changes outside of the GitOps delivery, which is more like what I'd consider a root cause here. Does this happen all the time? I tell people when they're using GitOps, they should treat `kubectl apply -f` or `kubectl edit` as a "break glass/emergency" tool.
You should have monitoring in place, such that when this person has scaled down the important deployment, an SRE/op gets notified and someone has to acknowledge the issue. If you have monitoring but it doesn't let you know when this important service has no backends/endpoints then it's an issue that should be resolved on the monitoring side. When you follow the Flux prometheus guide, you get a lot of these building blocks – I'm not sure how one configures Prometheus to let you know when a service has no backends, but I'm sure there must be a way. (Wish I could be more helpful there...)
The script approach doesn't have any drawbacks I can think of other than the ones you've mentioned. There is some performance impact, because each HelmRelease upgrade has to load the chart and will consume a lot of resources while the upgrade is in progress.
I definitely would expect the script to run every (N) hours rather than every (N) minutes for this reason. But if you are on the latest version of Flux, many of the performance issues will have been mitigated. It really depends a lot on how your sources are arranged and whether you've adopted the new `type: oci` HelmRepository, which is way more efficient than others that came before.
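For reference, a minimal sketch of such a repository (the URL and names are placeholders):

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: my-oci-repo
  namespace: flux-system
spec:
  type: oci                          # OCI-backed repository instead of an HTTP index
  interval: 10m
  url: oci://ghcr.io/my-org/charts   # placeholder registry path
```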
The monitoring that I mentioned will give you an idea of how great the performance impact is, if you've configured it according to the Flux monitoring guide.
This change supporting drift correction in Helm Controller is expected Real Soon Now, so please be prepared to throw the script away very soon. 🥸👩💻
@kingdonb do you know/have any estimation on when this feature could be in Flux2's helm-controller?
In flux v1, if a deployment spec was known to have been manually edited in the cluster, we would run `kubectl delete hr/<application> --cascade=false`. This would prompt flux to recreate the HelmRelease object in the cluster, triggering Helm to revert any drift with minimal service disruption (and without needing dummy git commits to the HelmRelease manifest).
Is this still the way flux2 works?
> In flux v1, if a deployment spec was known to have been manually edited in the cluster, we would run `kubectl delete hr/<application> --cascade=false`. This would prompt flux to recreate the HelmRelease object in the cluster, triggering Helm to revert any drift with minimal service disruption (and without needing dummy git commits to the HelmRelease manifest). Is this still the way flux2 works?
@benjimin yes it works too + there's a sequence of `flux suspend hr` and `flux resume hr` CLI commands that will also fix the drift.
And there is a third way, for when you really don't want to incur downtime: if you have a failed release with `upgrade retries exhausted`, then you can delete the latest (failed) version of the helm secret (which Helm itself uses for release accounting under the hood) and then reconcile the HelmRelease to try again. This works when it's "super stuck", but in general it also works fine to just suspend and resume the HelmRelease; that gets you "another try" from the installing reconciler.
This feature is planned to be experimentally supported in the next minor release, after which it will be enabled by default once we have landed things like the long-overdue #532 (and follow-ups to it). The reason it will first be experimentally supported is that:
On a second note, I want to let you know that for the foreseeable future (or unless things of utter priority come up), the helm-controller now has my full attention. This took some time after we reshaped our roadmap to focus on the Flux Source APIs first (and stop running around like crazy), but it means that I am at present "ahead of schedule".
Closing this in favor of #643. Please provide (short) feedback there, or create specific issues for problems you're experiencing. Thank you all!
> `flux suspend ...`, then `flux resume ...` will do the trick to recreate a resource that was deleted manually.
It seems that the above solution doesn't work anymore, but using `HelmRelease.spec.driftDetection.mode` from #643 worked for me, so it's all fine! :)
Thank you!
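For anyone landing here later, a minimal sketch of enabling it (names are placeholders; see #643 for the details):

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: my-release
spec:
  interval: 10m
  driftDetection:
    mode: enabled       # detect and correct drift of Helm-managed objects on reconciliation
  chart:
    spec:
      chart: my-chart
      sourceRef:
        kind: HelmRepository
        name: my-repo
```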
Suppose a `HelmRelease` object deploys a Helm chart that contains a Deployment. The pod resource can now be updated by the user. Until there is a chart version update in the `HelmRelease` object, the pod will continue to have the manually updated setting. I'm looking for a way to force reconciliation of the resources deployed by the `HelmRelease` every x minutes. Does helm-controller support this? There are a few other intervals, like the ones used by HelmRepository and HelmRelease, but they don't seem to reconcile a previously succeeded release unless there is a new version in the HelmRelease.
The `requeueDependency` interval is only responsible for re-reconciling if the dependencies for that HelmRelease failed. These seem to be unrelated to my scenario.