daniel-anova opened this issue 1 year ago
Could you let us know what version of Flux is installed?
I'm looking at this issue and it reminds me of an issue that affected Flux's "new" Server Side Apply when it was still very new, but that has been resolved for some time. It might not be the same issue, but in the interest of ruling it out quickly, what is the output of `flux check`? (It includes the version information.)
Also: was there any error when the HelmRelease change was not successfully applied? (It seems the changes were ignored; maybe an error actually prevented the reconciliation from completing.)
> E.g. at this point, values are set to:
>
> ```yaml
> values:
>   env:
>     - name: ihave
>       value: "apple"
> ```
>
> The resulting deployment env becomes:
>
> ```yaml
> env: []
> ```
It seems like the change perhaps hasn't actually been applied. Can you check if there is any error reconciling the HelmRelease or the parent Kustomization that might explain why this happened?
This could be a bug in Helm's three-way merge, or something else. Thank you for reporting this! I think you've given us enough information here to reproduce it, though it may take some time, as we are working on issues for GA as well as KubeCon (but if this is a bug then obviously we'd like to squash it before GA).
> Could you let us know what version of Flux is installed?
>
> I'm looking at this issue and it reminds me of an issue that affected Flux's "new" Server Side Apply when it was still very new, but it's been resolved for some time. It might not be the same issue, but in the interest of ruling it out quickly, what is the output of `flux check` (it includes the version information)
Currently on v0.24 of the Helm Controller.
> it seems like the change perhaps hasn't actually been applied, can you check if there is any error reconciling the HelmRelease or parent Kustomization that might explain why this happened?
Test manifests were applied directly with `kubectl`, so Kustomize controller was not involved.
As for the Helm Controller log, it shows nothing relevant, simply that reconciliation finished:

| time | message |
| --- | --- |
| 2022-10-06 09:17:23 | `msg="reconcilation finished in 57.883487ms, next run in 5m0s"` |
| 2022-10-06 09:15:30 | `msg="reconcilation finished in 45.435263ms, next run in 5m0s"` |
| 2022-10-06 09:12:11 | `msg="reconcilation finished in 9.175023225s, next run in 5m0s"` |
| 2022-10-06 09:10:43 | `msg="reconcilation finished in 12.716648708s, next run in 5m0s"` |
| 2022-10-06 09:10:30 | `msg="reconcilation finished in 83.638857ms, next run in 5m0s"` |
I wonder if this is a bug in Flux at all, or if it's even a bug in Helm.
We need to see if this behavior reproduces in the Helm CLI if you try a similar thing. I'm wondering if it's somehow part of the three-way merge behavior. You mentioned that you did `kubectl apply` and aren't using Kustomize controller; just to clarify, that could be part of the issue as well. If you did `kubectl apply`, then it's using a three-way merge, and you need to be quite sure the HelmRelease that's represented on the cluster says what you think. It might have merged the values in the array.
If you were using Kustomize controller here, it does not use `kubectl apply` with three-way merge in quite the same way; Flux will actually wipe out any drift (unless you needed three-way merge, in which case there's a way to opt into it). But in the `kubectl` CLI for Kubernetes, `kubectl apply -f` is a three-way merge by default, and you can get strange behavior from it. Broadly speaking, the server-side apply feature in Kustomize controller fixes this.
If you can confirm whether the YAML on the cluster has really been affected: one way to see is to use `kubectl edit` in between and verify the YAML looks how you expect; another way to mitigate this problem would be to see if it behaves the same when you do `kubectl replace` instead of `kubectl apply`.
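To illustrate why an apply-style merge can silently collapse duplicates: Kubernetes strategic merge treats container `env` as a list merged by its `name` key. The following is a minimal, hypothetical sketch (not the actual Kubernetes apply machinery), assuming the merge is keyed on `name`:

```python
# Hypothetical sketch of a strategic-merge-style list merge keyed on "name".
# This is NOT Kubernetes source code; it only illustrates why entries that
# share a name collapse into a single entry during a merge.
def strategic_merge_env(live, patch, merge_key="name"):
    # Index the live list by the merge key: a duplicated key can only keep one entry.
    merged = {e[merge_key]: dict(e) for e in live}
    # Apply the patch entries on top, matched by the same key.
    for e in patch:
        merged.setdefault(e[merge_key], {}).update(e)
    return list(merged.values())

live = [{"name": "DEBUG", "value": "A"}, {"name": "DEBUG", "value": "B"}]
patch = [{"name": "DEBUG", "value": "A"}]
print(strategic_merge_env(live, patch))  # [{'name': 'DEBUG', 'value': 'A'}]
```

Because the live list is first keyed by `name`, two entries sharing a name can never both survive a merge, which would match the symptoms described above.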
Can you check on these avenues so we can verify this is a bug before we spend more time investigating further?
The HelmRelease was correct with both `kubectl` and Kustomize controller. I just retested the test case to be 100% sure: when removing the duplicate value, the HelmRelease shows the remaining env variable as expected, and the same behavior is seen when using the controller.
The Helm-rendered Deployment gets all env variable entries removed, and further reconciliation does not add the env var still present in the HelmRelease back to the deployment.
I hope that clarifies things.
Thanks @daniel-anova – I hope to see that someone can follow this up before too long after KubeCon. I won't be able to look at it again before then myself.
I have stumbled on this in a bit different way, but I can tell for sure that Helm Controller does not comply with how Helm works.

Here is the case: basically, I am trying to leverage the Kubernetes rules for merging and defining the same variable in different places: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#environment-variables
Simplified, the precedence of environment variables is the following:

1. `env`
2. `envFrom`
3. `ENV` within the image

If it happens that within one section, e.g. `env`, there are duplicate names, the last one wins.
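The "last one wins" rule within a single section can be sketched as follows (a hypothetical illustration of the resolution order, not kubelet code):

```python
def effective_env(env_list):
    # Later entries with the same name override earlier ones ("last one wins").
    resolved = {}
    for e in env_list:
        resolved[e["name"]] = e["value"]
    return resolved

env = [{"name": "DEBUG", "value": "A"}, {"name": "DEBUG", "value": "B"}]
print(effective_env(env))  # {'DEBUG': 'B'}
```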
So, having a `deployment.yaml` template that has:

```yaml
...
          {{- if or .Values.env .Values.envOverride }}
          env:
          {{- end }}
          {{- if .Values.env }}
          {{- toYaml .Values.env | nindent 10 }}
          {{- end }}
          {{- if .Values.envOverride }}
          {{- toYaml .Values.envOverride | nindent 10 }}
          {{- end }}
...
```
given this `values.yaml`:

```yaml
env:
  - name: DEBUG
    value: 'A'
envOverride:
  - name: DEBUG
    value: 'B'
```
Using Helm Controller produces a deployment having only the first item of the list:

```yaml
...
env:
  - name: DEBUG
    value: 'A'
...
```
Whereas Helm itself produces a warning:

```
W0603 16:56:14.612333  378861 warnings.go:70] spec.template.spec.containers[0].env[6].name: duplicate name "DEBUG"
```

and a deployment having both items of the list, as intended to work with the Kubernetes precedence described above:

```yaml
...
env:
  - name: DEBUG
    value: 'A'
  - name: DEBUG
    value: 'B'
...
```
So when using Helm Controller, the resulting value in the container will be `DEBUG=A`, which is not correct, whereas when using Helm the resulting value in the container will be `DEBUG=B`, which is correct.
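The divergence can be summarized in a small sketch (hypothetical helper names, not controller or Helm code): Helm hands the rendered list to the API server untouched, so the container sees the last duplicate, while a merge keyed on `name` (as in a strategic or server-side apply path) collapses the duplicates first:

```python
rendered = [{"name": "DEBUG", "value": "A"}, {"name": "DEBUG", "value": "B"}]

def merge_by_name(entries):
    # Keeps one entry per name; the first occurrence wins here, matching the
    # deployment observed from Helm Controller above.
    out = {}
    for e in entries:
        out.setdefault(e["name"], e)
    return list(out.values())

def last_wins(entries):
    # Resolution of the untouched list inside the container: last duplicate wins.
    return {e["name"]: e["value"] for e in entries}

print(merge_by_name(rendered))  # [{'name': 'DEBUG', 'value': 'A'}]
print(last_wins(rendered))      # {'DEBUG': 'B'}
```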
> I wonder if this is a bug in Flux at all, or if it's even a bug in Helm
Please check this Kubernetes issue (and this Helm one, closed as stale).
Helm Controller allows creating a HelmRelease with a duplicate environment variable; it deploys with just a warning about the duplication. However, when removing one of the duplicate values, Helm Controller will remove all copies of that environment variable.
So if the deployment has:
and values are set to:

```yaml
values:
  env:
    - name: ihave
      value: "apple"
```

The resulting deployment env becomes:

```yaml
env: []
```
The only way to restore the deployment is to delete the manifest and force reconciliation, or to add an additional environment variable.
When trying to add duplicate values to an existing HelmRelease, however, the new duplicates are ignored.
and values are set to:
The resulting deployment env becomes:
The behavior should be consistent: Helm Controller should either never allow the duplicate values, or deal with them correctly on removal.
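One consistent option, sketched hypothetically (this helper does not exist in Helm Controller), would be to detect duplicate names in the rendered list up front and fail or warn loudly, instead of silently dropping entries later:

```python
def find_duplicate_env_names(env):
    # Hypothetical validation helper: returns names that appear more than once
    # in a rendered env list, so a controller could reject or surface them.
    seen, dupes = set(), []
    for e in env:
        if e["name"] in seen and e["name"] not in dupes:
            dupes.append(e["name"])
        seen.add(e["name"])
    return dupes

env = [{"name": "DEBUG", "value": "A"}, {"name": "DEBUG", "value": "B"}]
print(find_duplicate_env_names(env))  # ['DEBUG']
```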
Test case Helm Release
Relevant part of Helm Chart template