schmidt-i opened this issue 1 year ago
Can you please post the output of these commands here:
flux version
kubectl get helmreleases.helm.toolkit.fluxcd.io -n blueprint prometheus-msteams --show-managed-fields -o yaml
Sure
$ flux version
flux: v0.35.0
helm-controller: v0.25.0
image-automation-controller: v0.26.0
image-reflector-controller: v0.22.0
kustomize-controller: v0.29.0
notification-controller: v0.27.0
source-controller: v0.30.0
$ kubectl get helmreleases.helm.toolkit.fluxcd.io -n blueprint prometheus-msteams --show-managed-fields -o yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  creationTimestamp: "2022-06-24T15:23:00Z"
  finalizers:
  - finalizers.fluxcd.io
  generation: 4
  labels:
    kustomize.toolkit.fluxcd.io/name: blueprint
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  managedFields:
  - apiVersion: helm.toolkit.fluxcd.io/v2beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          f:kustomize.toolkit.fluxcd.io/name: {}
          f:kustomize.toolkit.fluxcd.io/namespace: {}
      f:spec:
        f:chart:
          f:spec:
            f:chart: {}
            f:sourceRef:
              f:kind: {}
              f:name: {}
              f:namespace: {}
            f:version: {}
        f:dependsOn: {}
        f:install:
          f:remediation:
            f:retries: {}
        f:interval: {}
        f:releaseName: {}
        f:values: {}
        f:valuesFrom: {}
    manager: kustomize-controller
    operation: Apply
    time: "2022-10-11T16:08:39Z"
  - apiVersion: helm.toolkit.fluxcd.io/v2beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"finalizers.fluxcd.io": {}
    manager: helm-controller
    operation: Update
    time: "2022-06-24T15:23:00Z"
  - apiVersion: helm.toolkit.fluxcd.io/v2beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions: {}
        f:helmChart: {}
        f:lastAppliedRevision: {}
        f:lastAttemptedRevision: {}
        f:lastAttemptedValuesChecksum: {}
        f:lastReleaseRevision: {}
        f:observedGeneration: {}
    manager: helm-controller
    operation: Update
    subresource: status
    time: "2023-01-19T20:56:50Z"
  name: prometheus-msteams
  namespace: blueprint
  resourceVersion: "136463080"
  uid: d76ace06-4b6f-442a-bbac-dd22738adf9c
spec:
  chart:
    spec:
      chart: prometheus-msteams
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: HelmRepository
        name: prometheus-msteams
        namespace: blueprint
      version: 1.3.1
  dependsOn:
  - name: prometheus-operator
  install:
    remediation:
      retries: 3
  interval: 1m
  releaseName: prometheus-msteams
  values:
    customCardTemplate: '{{ define "teams.card" }} { "@type": "MessageCard", "@context":
      "http://schema.org/extensions", "themeColor": "{{- if eq .Status "resolved" -}}2DC72D
      {{- else if eq .Status "firing" -}} {{- if eq .CommonLabels.severity "critical" -}}8C1A1A
      {{- else if eq .CommonLabels.severity "warning" -}}FFA500 {{- else -}}808080{{- end -}}
      {{- else -}}808080{{- end -}}", "summary": "{{- if eq .CommonAnnotations.summary "" -}}
      {{- if eq .CommonAnnotations.message "" -}} {{- js .CommonLabels.cluster | reReplaceAll
      "" " " | reReplaceAll "-" " " | reReplaceAll \'' "''" -}} {{- else -}}
      {{- js .CommonAnnotations.message | reReplaceAll "" " " | reReplaceAll "-" " " |
      reReplaceAll \'' "''" -}} {{- end -}} {{- else -}} {{- js .CommonAnnotations.summary |
      reReplaceAll "" " " | reReplaceAll "-" " " | reReplaceAll \'' "''" -}} {{- end -}}",
      "title": "Prometheus Alert ({{ .Status }})", "sections": [ {{$externalUrl := .ExternalURL}}
      {{- range $index, $alert := .Alerts }}{{- if $index }},{{- end }} { "activityTitle":
      "[{{ js $alert.Annotations.description | reReplaceAll "" " " | reReplaceAll \'' "''"
      }}]({{ $externalUrl }})", "facts": [ {{- range $key, $value := $alert.Annotations }} {
      "name": "{{ $key }}", "value": "{{ js $value | reReplaceAll "" " " | reReplaceAll \''
      "''" }}" }, {{- end -}} {{$c := counter}}{{ range $key, $value := $alert.Labels }}{{if
      call $c}},{{ end }} { "name": "{{ $key }}", "value": "{{ js $value | reReplaceAll "" " "
      | reReplaceAll \'' "''" }}" } {{- end }} ], "markdown": true } {{- end }} ] } {{ end }}'
    metrics:
      serviceMonitor:
        enabled: true
        scrapeInterval: 30s
    replicaCount: 2
    resources:
      limits:
        cpu: 30m
  valuesFrom:
  - kind: ConfigMap
    name: prometheus-msteams-config-values
    optional: true
status:
  conditions:
  - lastTransitionTime: "2023-01-19T20:56:50Z"
    message: Release reconciliation succeeded
    reason: ReconciliationSucceeded
    status: "True"
    type: Ready
  - lastTransitionTime: "2022-10-11T16:10:09Z"
    message: Helm upgrade succeeded
    reason: UpgradeSucceeded
    status: "True"
    type: Released
  helmChart: blueprint/blueprint-prometheus-msteams
  lastAppliedRevision: 1.3.1
  lastAttemptedRevision: 1.3.1
  lastAttemptedValuesChecksum: 4c81287381ac4d31719d9a83a6711baed5b92daf
  lastReleaseRevision: 3
  observedGeneration: 4
So if you commit the HelmRelease without customCardTemplate, it doesn't get removed? If so, does the Kustomization report any errors? You can check with flux get kustomization.
Exactly. The customCardTemplate is removed in Git, yet there are no errors on the Kustomizations:
$ flux get kustomization
NAME REVISION SUSPENDED READY MESSAGE
blueprint 7.8.0/2b0a442 False True Applied revision: 7.8.0/2b0a442
blueprint-extras 7.8.0/2b0a442 False True Applied revision: 7.8.0/2b0a442
The kustomize-controller logs report that the HelmRelease is unchanged: "HelmRelease/blueprint/prometheus-msteams":"unchanged"
Hmm, but you're using a tag, 7.8.0. Does it contain the customCardTemplate removal?
Yes. This release contains the patch where the value was removed. That was the first thing I checked.
Hello, we have been hit by the same problem on different resources, on multiple clusters from different providers, for a few weeks now. Some values keys in HelmReleases are not removed by the reconciliation.
We are currently running v0.38.2, on Kubernetes v1.24.8-gke.2000 and v1.23.6.
The weird thing is that even if we kubectl apply the resource directly to get around the problem, the key is still not removed and kubectl replies with unchanged (except for the last-applied-configuration annotation). I suppose it may be linked to managed fields, but I don't know this mechanism well enough.
Comparing the managed fields on one resource that has the problem, the only entries missing from the list are the two fields we are trying to remove.
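For context on why a missing managed-fields entry matters: with server-side apply, the API server removes a field from the live object only when the applier both owned that field previously and omits it from the newly applied configuration. A minimal toy model of that pruning rule (illustrative only, not the real apiserver code; all names are made up):

```python
def ssa_prune(live: dict, applied: dict, owned: set, prefix: str = "") -> dict:
    """Toy model of server-side apply pruning. A key disappears from the
    live object only when this applier owned it AND omits it from the new
    applied configuration; keys it never owned are left untouched.
    (Additions and other field managers are out of scope for this sketch.)"""
    result = {}
    for key, value in live.items():
        path = f"{prefix}.{key}"
        if key in applied:
            if isinstance(value, dict) and isinstance(applied[key], dict):
                result[key] = ssa_prune(value, applied[key], owned, path)
            else:
                result[key] = applied[key]
        elif path in owned:
            pass  # owned and now omitted -> removed
        else:
            result[key] = value  # never claimed -> omission is not removal
    return result

live = {"values": {"server": {"image": {"tag": "1.9.3"}, "replicas": 5}}}
applied = {"values": {"server": {"replicas": 5}}}

# Ownership recorded only as a bare f:values: {} entry -- the children were
# never claimed, so dropping the image block from Git removes nothing:
broken = ssa_prune(live, applied, {".values"})
assert broken["values"]["server"]["image"] == {"tag": "1.9.3"}

# Ownership recorded per sub-field -- the omitted key is pruned as expected:
fixed = ssa_prune(live, applied, {".values", ".values.server",
                                  ".values.server.image", ".values.server.replicas"})
assert fixed == {"values": {"server": {"replicas": 5}}}
```

In the first case the applier never owned values.server.image, so omitting it changes nothing, which matches the behaviour reported above.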
I wasn't able to reproduce the issue on a kind cluster with Kubernetes 1.24.6 and Flux 0.35.0 from scratch so I suspect that a sequence of changes put the cluster in a state where this happens.
@schmidt-i are you able to reproduce the issue even on a fresh cluster?
I've just hit what looks like an identical issue today; do let me know if it sounds like a different problem and I should create a separate issue!
flux: v0.40.2
helm-controller: v0.29.0
image-automation-controller: v0.29.0
image-reflector-controller: v0.24.0
kustomize-controller: v0.33.0
notification-controller: v0.31.0
source-controller: v0.34.0
My scenario is that I've got a PrometheusRule with a couple of groups in it, something along these lines:
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: cloud-admission-ctl-alerts-short-span
namespace: cloud-admission-controller
labels:
target: alertmanager
spec:
groups:
- name: cloud-admission-controller
rules:
- alert: CloudAdmissionCtlDown
(...)
- name: cloud-admission-controller-probe
rules:
- alert: CloudAdmissionCtlProbeFailed
(...)
- alert: CloudAdmissionCtlProbeStale
(...)
- alert: CloudAdmissionCtlProbeHugelyStale
(...)
I am slowly decommissioning the CloudAdmissionCtlDown alert across a fleet of clusters, so I've created a JSON patch in kustomize for a few clusters like so:
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
(...)
patches:
- target:
group: monitoring.coreos.com
version: v1
kind: PrometheusRule
name: cloud-admission-ctl-alerts-short-span
patch: |-
- op: remove
path: /spec/groups/0
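For clarity on what that patch does: an RFC 6902 remove at /spec/groups/0 deletes the first entry of the groups list, i.e. the group containing CloudAdmissionCtlDown, leaving the probe group behind. A minimal illustrative sketch of the remove operation (not kustomize's actual implementation):

```python
def json_patch_remove(doc, path: str):
    """Minimal sketch of the RFC 6902 'remove' operation for map keys
    and list indices, mutating doc in place."""
    parts = [p for p in path.split("/") if p]
    target = doc
    for p in parts[:-1]:  # walk down to the parent of the target
        target = target[int(p)] if isinstance(target, list) else target[p]
    last = parts[-1]
    if isinstance(target, list):
        target.pop(int(last))
    else:
        del target[last]
    return doc

rule = {"spec": {"groups": [
    {"name": "cloud-admission-controller"},
    {"name": "cloud-admission-controller-probe"},
]}}
json_patch_remove(rule, "/spec/groups/0")
assert rule["spec"]["groups"] == [{"name": "cloud-admission-controller-probe"}]
```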
This patch is replicated in four separate places: one cluster and three accounts (for every account there's an overlay which is included by the clusters in that account). The result of this being applied by Flux is quite surprising because:
a) even though a local kustomize build correctly drops that element, most of the clusters did not notice a change
b) the behaviour is inconsistent - two of the clusters did actually apply it correctly!
After a while of scratching my head and trying different things (mostly making changes to the PrometheusRule in one of the affected clusters) I tried removing the group I want removed via a manual kubectl edit
of the PrometheusRule - which not only worked but Flux did not revert that change! So this does indeed look like something is causing it to be ignored but only in certain circumstances.
@makkes's last question might actually be relevant here because the two clusters where the patch worked are pretty new (only built a couple of weeks ago) and thus have only had one version of Flux 2 deployed to them with no subsequent upgrades, and have never had any Flux 1 components deployed to them. The clusters where I'm experiencing the issue have had (and still have) Flux 1 deployed and have been through a few Flux 2 upgrades.
FWIW I'm deploying Flux2 using the community helm chart.
To make things even more interesting, just as I was writing this up I thought I should try making another edit to this PrometheusRule in the cluster where I've done the manual edit before (by adding a fake group with a fake alert) - to my surprise the next reconciliation has correctly removed the edit.
I'm in a position where I can actually leave things as they are for a few days so please do let me know if there's any further debugging I could do to triage this issue further.
Thanks!
I'm hitting a similar issue, running Flux 0.40.2.
I've seen this on a few HelmReleases already: when I remove a key in the Git repository, it does not get deleted in the cluster. The key is still visible/deployed in the HelmRelease (and also in helm get values).
I'm not sure whether it is the source-controller that caches this key, or the helm/kustomize controller. I've tried to force reconciliation with flux reconcile hr --with-source, but nothing changed. If I remove the key from the HelmRelease definition manually, Flux will not restore it. I'm wondering where those keys are cached; I've killed all Flux components so they should pick up a clean state, but the key was still not removed.
We have 8 clusters and each cluster shares the same config, yet the behavior is random: on some clusters the key is removed properly.
I've also tried setting upgrade.preserveValues: false in the HelmRelease and then changing some random value, but that didn't remove the old keys.
I'd like to know if there is any workaround that forces a Helm upgrade with clean values without removing the resources themselves.
An example of a removed key, from the kube-prometheus-stack chart:
values:
prometheus:
prometheusSpec:
image:
tag: v2.41.0
After removing the image block, Prometheus keeps deploying the old version instead of v2.42.0.
We are on v0.38.3 and are seeing the same problem, exactly as described in the first post. It's extremely concerning because it breaks the entire contract that Flux 2 has (that it will apply the changes in the config). It would be much better if it at least surfaced an error somewhere.
It's extremely concerning because it breaks the entire contract that Flux 2 has (that it will apply the changes in the config).
This is pretty serious. Is there a maintainer we can ping or a path to escalation? Thanks!
We are also seeing this on 0.41.2 and on 2.0.0-rc.3. Values removed from a Helm chart's values in the Git repo are not getting removed from the deployed Helm release.
We're also seeing this on 2.0.0-rc.5. Like jkotiuk mentioned, manually editing the HelmRelease resource to remove the keys was a viable workaround; Flux did not try to restore the previous values.
I've also seen this issue on 0.41.2 recently.
My git diff looks like this (it's part of a patch):
diff --git a/clusters/dev-sandbox-redux/flux-components/vault/values.yaml b/clusters/dev-sandbox-redux/flux-components/vault/values.yaml
index 52f453ba..ff8d1e4f 100644
--- a/clusters/dev-sandbox-redux/flux-components/vault/values.yaml
+++ b/clusters/dev-sandbox-redux/flux-components/vault/values.yaml
@@ -6,7 +6,7 @@ metadata:
spec:
chart:
spec:
- version: "v0.19.0"
+ version: "v0.20.1"
values:
global:
tlsDisable: false # Enable HTTPS (uses certificates from Cert-manager)
@@ -15,8 +15,6 @@ spec:
- name: dockerhub
server:
repository: "public.ecr.aws/hashicorp/vault"
- image:
- tag: "1.9.3"
extraArgs: "-config=/config/vault-config/config.hcl" # Get configuration from K8s secret (provisioned by Terraform)
extraVolumes:
- type: secret
@@ -76,7 +74,6 @@ spec:
injector:
agentImage:
repository: "public.ecr.aws/hashicorp/vault"
- tag: "1.9.2"
replicas: 2
annotations:
cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
When I run flux build kustomization --path=./clusters/dev-sandbox-redux/flux-components/vault/ vault I get:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
labels:
kustomize.toolkit.fluxcd.io/name: vault
kustomize.toolkit.fluxcd.io/namespace: flux-system
name: vault
namespace: hashicorp
spec:
chart:
spec:
chart: vault
sourceRef:
kind: HelmRepository
name: vault
namespace: flux-system
version: v0.20.1
install:
remediation:
retries: 3
interval: 1h0m0s
releaseName: vault
upgrade:
crds: CreateReplace
values:
global:
imagePullSecrets:
- name: quay
- name: dockerhub
tlsDisable: false
injector:
agentImage:
repository: public.ecr.aws/hashicorp/vault
annotations:
cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
replicas: 2
server:
authDelegator:
enabled: false
dataStorage:
enabled: false
extraArgs: -config=/config/vault-config/config.hcl
extraVolumes:
- name: vault-service-cluster-zone-tls
type: secret
- name: vault-config
path: /config
type: secret
ha:
config: |
storage "consul" {
path = "vault"
address = "HOST_IP:8500"
}
telemetry {
dogstatsd_addr = "HOST_IP:8125"
}
disruptionBudget.maxUnavailable: 2
enabled: true
replicas: 5
ingress:
activeService: false
annotations:
external-dns.alpha.kubernetes.io/cloudflare-proxied: "false"
kubernetes.io/ingress.class: nginx-internal
nginx.ingress.kubernetes.io/ssl-passthrough: "true"
enabled: true
hosts: "...redacted..."
repository: public.ecr.aws/hashicorp/vault
resources:
limits:
cpu: 1
memory: 1Gi
requests:
cpu: 1
memory: 1Gi
service:
annotations: {}
updateStrategyType: RollingUpdate
So it's definitely not there in the build. But when I apply it, it doesn't prune the previously explicitly set image tag values from the HelmRelease.
They're also not detected by flux diff kustomization --path=./clusters/dev-sandbox-redux/flux-components/vault/ vault:
. Kustomization diffing...: running dry-run
.. Kustomization diffing...: processing inventory
✓ Kustomization diffing...
► HelmRelease/hashicorp/vault drifted
metadata.generation
± value change
- 5
+ 6
spec.chart.spec.version
± value change
- v0.19.0
+ v0.20.1
We end up having to manually edit the HelmRelease to remove these left-behind fields.
A copy of my current HelmRelease (before any changes) is here:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
creationTimestamp: "2023-03-29T19:17:31Z"
finalizers:
- finalizers.fluxcd.io
generation: 5
labels:
kustomize.toolkit.fluxcd.io/name: vault
kustomize.toolkit.fluxcd.io/namespace: flux-system
name: vault
namespace: hashicorp
resourceVersion: "1585885902"
uid: b2397151-653c-46e9-8a70-fd7ee27e0977
spec:
chart:
spec:
chart: vault
reconcileStrategy: ChartVersion
sourceRef:
kind: HelmRepository
name: vault
namespace: flux-system
version: v0.19.0
install:
remediation:
retries: 3
interval: 1h0m0s
releaseName: vault
upgrade:
crds: CreateReplace
values:
global:
imagePullSecrets:
- name: quay
- name: dockerhub
tlsDisable: false
injector:
agentImage:
repository: public.ecr.aws/hashicorp/vault
tag: 1.9.2
annotations:
cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
replicas: 2
server:
authDelegator:
enabled: false
dataStorage:
enabled: false
extraArgs: -config=/config/vault-config/config.hcl
extraVolumes:
- name: vault-service-cluster-zone-tls
type: secret
- name: vault-config
path: /config
type: secret
ha:
config: |
storage "consul" {
path = "vault"
address = "HOST_IP:8500"
}
telemetry {
dogstatsd_addr = "HOST_IP:8125"
}
disruptionBudget.maxUnavailable: 2
enabled: true
replicas: 5
image:
tag: 1.9.3
ingress:
activeService: false
annotations:
external-dns.alpha.kubernetes.io/cloudflare-proxied: "false"
kubernetes.io/ingress.class: nginx-internal
nginx.ingress.kubernetes.io/ssl-passthrough: "true"
enabled: true
hosts: "...redacted..."
repository: public.ecr.aws/hashicorp/vault
resources:
limits:
cpu: 1
memory: 1Gi
requests:
cpu: 1
memory: 1Gi
service:
annotations: {}
updateStrategyType: RollingUpdate
status:
conditions:
- lastTransitionTime: "2023-05-31T11:40:19Z"
message: Release reconciliation succeeded
reason: ReconciliationSucceeded
status: "True"
type: Ready
- lastTransitionTime: "2023-05-17T20:38:27Z"
message: Helm upgrade succeeded
reason: UpgradeSucceeded
status: "True"
type: Released
helmChart: flux-system/hashicorp-vault
lastAppliedRevision: 0.19.0
lastAttemptedRevision: 0.19.0
lastAttemptedValuesChecksum: 5635480e74c3e338b6741546341da74945b593ea
lastReleaseRevision: 14
observedGeneration: 5
I'm also happy to help debug this if that's of any use.
Thanks for looking into it!
Is the patch diff part of a Flux kustomization file?
https://github.com/fluxcd/flux2/pull/4062
You could be hitting this issue, which is fixed after 2.0.1. @ebachle, could you take a look at this and see if it sounds like your issue? The original report is from a very old version; unless we have an active reporter after 2.0.1, I think we should close it. If you can read the description of the linked PR, confirm the location of the patch, and briefly check whether you should be using kustomization-file, try that and let us know if it solves your issue; then I think we can close this.
Hey @kingdonb, this is going to be a bit of a long answer, but here's what I eventually found out.
No guarantees on this, but I'm almost certain the version we installed the HelmRelease with was kustomize-controller:v0.21.1. We're currently running kustomize-controller:v0.35.1, so it's possible the fix landed somewhere in between.
What I've ultimately come to conclude is that the issue lies in which fields Kubernetes thinks the kustomize-controller maintains in the manifest.
This is a "before" view of the managedFields section of the manifest:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
creationTimestamp: "2023-03-29T19:17:31Z"
finalizers:
- finalizers.fluxcd.io
generation: 5
labels:
kustomize.toolkit.fluxcd.io/name: vault
kustomize.toolkit.fluxcd.io/namespace: flux-system
managedFields:
- apiVersion: helm.toolkit.fluxcd.io/v2beta1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:labels:
f:kustomize.toolkit.fluxcd.io/name: {}
f:kustomize.toolkit.fluxcd.io/namespace: {}
f:spec:
f:chart:
f:spec:
f:chart: {}
f:sourceRef:
f:kind: {}
f:name: {}
f:namespace: {}
f:version: {}
f:install:
f:remediation:
f:retries: {}
f:interval: {}
f:releaseName: {}
f:upgrade:
f:crds: {}
f:values: {}
manager: kustomize-controller
operation: Apply
time: "2023-05-17T20:38:24Z"
- apiVersion: helm.toolkit.fluxcd.io/v2beta1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.: {}
v:"finalizers.fluxcd.io": {}
manager: helm-controller
operation: Update
time: "2023-03-29T19:17:31Z"
- apiVersion: helm.toolkit.fluxcd.io/v2beta1
fieldsType: FieldsV1
fieldsV1:
f:status:
f:conditions: {}
f:helmChart: {}
f:lastAppliedRevision: {}
f:lastAttemptedRevision: {}
f:lastAttemptedValuesChecksum: {}
f:lastReleaseRevision: {}
f:observedGeneration: {}
manager: helm-controller
operation: Update
subresource: status
time: "2023-05-31T11:40:19Z"
name: vault
namespace: hashicorp
resourceVersion: "1585885902"
uid: b2397151-653c-46e9-8a70-fd7ee27e0977
spec:
When I apply a change that doesn't remove a field (really just a random change, additive or mutating) but forces the kustomize-controller to re-apply the object, this is what it becomes.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
creationTimestamp: "2023-03-29T19:17:31Z"
finalizers:
- finalizers.fluxcd.io
generation: 6
labels:
kustomize.toolkit.fluxcd.io/name: vault
kustomize.toolkit.fluxcd.io/namespace: flux-system
managedFields:
- apiVersion: helm.toolkit.fluxcd.io/v2beta1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:labels:
f:kustomize.toolkit.fluxcd.io/name: {}
f:kustomize.toolkit.fluxcd.io/namespace: {}
f:spec:
f:chart:
f:spec:
f:chart: {}
f:sourceRef:
f:kind: {}
f:name: {}
f:namespace: {}
f:version: {}
f:install:
f:remediation:
f:retries: {}
f:interval: {}
f:releaseName: {}
f:upgrade:
f:crds: {}
f:values:
f:global:
.: {}
f:imagePullSecrets: {}
f:tlsDisable: {}
f:injector:
.: {}
f:agentImage:
.: {}
f:repository: {}
f:annotations:
.: {}
f:cluster-autoscaler.kubernetes.io/safe-to-evict: {}
f:replicas: {}
f:server:
.: {}
f:authDelegator:
.: {}
f:enabled: {}
f:dataStorage:
.: {}
f:enabled: {}
f:extraArgs: {}
f:extraVolumes: {}
f:ha:
.: {}
f:config: {}
f:disruptionBudget.maxUnavailable: {}
f:enabled: {}
f:replicas: {}
f:ingress:
.: {}
f:activeService: {}
f:annotations:
.: {}
f:external-dns.alpha.kubernetes.io/cloudflare-proxied: {}
f:kubernetes.io/ingress.class: {}
f:nginx.ingress.kubernetes.io/ssl-passthrough: {}
f:enabled: {}
f:hosts: {}
f:repository: {}
f:resources:
.: {}
f:limits:
.: {}
f:cpu: {}
f:memory: {}
f:requests:
.: {}
f:cpu: {}
f:memory: {}
f:service:
.: {}
f:annotations:
.: {}
f:ad.datadoghq.com/service.check_names: {}
f:ad.datadoghq.com/service.init_configs: {}
f:ad.datadoghq.com/service.instances: {}
f:updateStrategyType: {}
manager: kustomize-controller
operation: Apply
time: "2023-07-19T15:31:24Z"
- apiVersion: helm.toolkit.fluxcd.io/v2beta1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.: {}
v:"finalizers.fluxcd.io": {}
manager: helm-controller
operation: Update
time: "2023-03-29T19:17:31Z"
- apiVersion: helm.toolkit.fluxcd.io/v2beta1
fieldsType: FieldsV1
fieldsV1:
f:status:
f:conditions: {}
f:helmChart: {}
f:lastAppliedRevision: {}
f:lastAttemptedRevision: {}
f:lastAttemptedValuesChecksum: {}
f:lastReleaseRevision: {}
f:observedGeneration: {}
manager: helm-controller
operation: Update
subresource: status
time: "2023-07-19T15:31:24Z"
name: vault
namespace: hashicorp
resourceVersion: "1749511002"
uid: b2397151-653c-46e9-8a70-fd7ee27e0977
spec:
Namely, beforehand the values field is only tracked at the top level:
f:values: {}
whereas afterwards every field below it is tracked individually. After that point I'm able to modify the value of my image field in a separate commit and all works as expected.
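A small sketch of how those FieldsV1 trees translate into ownership paths (illustrative only, not the apiserver's implementation): flattening the "before" entry yields nothing below values, so none of the individual value keys are owned and omitting one of them cannot trigger a removal.

```python
def fieldsv1_paths(fields: dict, prefix: str = "") -> set:
    """Flatten a FieldsV1 tree (as stored under metadata.managedFields)
    into dotted ownership paths. An empty map such as 'f:values': {}
    claims only that key itself, none of its children."""
    paths = set()
    for key, sub in fields.items():
        if key == ".":
            continue  # '.' marks ownership of the enclosing map itself
        name = key[2:] if key[:2] in ("f:", "v:", "k:") else key
        path = f"{prefix}.{name}"
        paths.add(path)
        if isinstance(sub, dict):
            paths |= fieldsv1_paths(sub, path)
    return paths

# Before: values is a bare leaf, so no individual value key is owned.
before = {"f:spec": {"f:values": {}}}
assert fieldsv1_paths(before) == {".spec", ".spec.values"}

# After the forced re-apply: every sub-field is owned individually.
after = {"f:spec": {"f:values": {"f:server": {"f:image": {"f:tag": {}}}}}}
assert ".spec.values.server.image.tag" in fieldsv1_paths(after)
```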
I've reviewed the changelog of the kustomize-controller from v0.21.1 to v0.35.1 and haven't seen anything that stands out as to why those values weren't stored as sub-objects of that managed field.
The other possibility is that it's something that changed between our upgrades from Kubernetes 1.22 to 1.23 in this time, but I'd be hard-pressed to find that one either.
I'd be curious if there are any ideas, but I wanted to share my findings regardless, in case anyone else finds themselves in this pickle.
I'm also not sure if there's a change that could be made to force this addition of new managed fields before applying a removal, but that also feels like a rather risky change in general, especially as releases after the GA may not have this issue.
Some details on the exact change I made... I updated the chart version in one PR:
diff --git a/clusters/arryn-staging-redux/flux-components/vault/values.yaml b/clusters/arryn-staging-redux/flux-components/vault/values.yaml
index 7b375891..bca9bea2 100644
--- a/clusters/arryn-staging-redux/flux-components/vault/values.yaml
+++ b/clusters/arryn-staging-redux/flux-components/vault/values.yaml
@@ -6,7 +6,7 @@ metadata:
spec:
chart:
spec:
- version: "v0.19.0"
+ version: "v0.20.1"
values:
global:
tlsDisable: false # Enable HTTPS (uses certificates from Cert-manager)
This resulted in this diff from flux diff:
. Kustomization diffing...: running dry-run
.. Kustomization diffing...: processing inventory
✓ Kustomization diffing...
► PriorityClass/global-cluster-critical drifted
metadata.labels.kustomize.toolkit.fluxcd.io/name
± value change
- consul
+ vault
► Namespace/hashicorp drifted
metadata.labels.kustomize.toolkit.fluxcd.io/name
± value change
- consul
+ vault
► HelmRelease/hashicorp/vault drifted
metadata.generation
± value change
- 8
+ 9
spec.chart.spec.version
± value change
- v0.19.0
+ v0.20.1
After that point the fields are managed as expected.
Then I made a separate PR to remove the image fields I no longer want to differ from the default values. And this was once the values field was managed by each sub-field:
diff --git a/clusters/arryn-staging-redux/flux-components/vault/values.yaml b/clusters/arryn-staging-redux/flux-components/vault/values.yaml
index 4fcd2b0a..ff8d1e4f 100644
--- a/clusters/arryn-staging-redux/flux-components/vault/values.yaml
+++ b/clusters/arryn-staging-redux/flux-components/vault/values.yaml
@@ -15,8 +15,6 @@ spec:
- name: dockerhub
server:
image:
repository: "public.ecr.aws/hashicorp/vault"
- tag: "1.9.3"
extraArgs: "-config=/config/vault-config/config.hcl" # Get configuration from K8s secret (provisioned by Terraform)
extraVolumes:
- type: secret
@@ -76,7 +74,6 @@ spec:
injector:
agentImage:
repository: "public.ecr.aws/hashicorp/vault"
- tag: "1.9.2"
replicas: 2
annotations:
cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
At that point my flux diff output is:
. Kustomization diffing...: running dry-run
.. Kustomization diffing...: processing inventory
✓ Kustomization diffing...
► HelmRelease/hashicorp/vault drifted
metadata.generation
± value change
- 9
+ 10
spec.values.injector.agentImage
- one map entry removed:
tag: 1.9.2
spec.values.server
- one map entry removed:
tag: 1.9.3
And all things seem managed as expected, including the removal/update of the field.
I appreciate you sharing your findings! I just want to make sure I read the conclusion correctly: you found that the upgrade did resolve the issue, though it sounds like you may have still had to force a change somehow to see the updated result in the end.
There were definitely updates in the kustomize-controller that affected how server-side apply reconciles sub-structures in later versions; I'm not sure of the exact versions that included these changes. So long as you're able to work with the current state in GA, and since it sounds like you have (or had) a repro of the issue on a version matching the report, if I understood all that correctly, then I believe based on your update we can close this issue.
Thanks again for reporting back @schmidt-i. Have I got that right?
Hi, was this issue resolved in 2.0.1? I'm also facing this bug in Flux 0.28.5.
We haven't gotten any more feedback on this issue for a couple of months now. 0.28.5 is more than 2 years old so if you would like to help here, you could upgrade to the latest Flux version and see if the issue goes away.
I'm still seeing this issue on v2.2.3
Describe the bug
Changes to a HelmRelease manifest from a Git repo are not applied by the kustomize controller, nor are they found by flux diff.
Steps to reproduce
Expected behavior
Changes are applied by the kustomize controller and the helm release is reconciled.
Screenshots and recordings
No response
OS / Distro
Linux
Flux version
v0.35.0
Flux check
► checking prerequisites
✔ Kubernetes 1.24.6 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.25.0
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.26.0
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.22.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.29.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.27.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.30.0
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta1
✔ buckets.source.toolkit.fluxcd.io/v1beta1
✔ gitrepositories.source.toolkit.fluxcd.io/v1beta1
✔ helmcharts.source.toolkit.fluxcd.io/v1beta1
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta1
✔ imagepolicies.image.toolkit.fluxcd.io/v1beta1
✔ imagerepositories.image.toolkit.fluxcd.io/v1beta1
✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta1
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1beta2
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta1
✔ receivers.notification.toolkit.fluxcd.io/v1beta1
✔ all checks passed
Git provider
GitHub (Enterprise)
Container Registry provider
No response
Additional context
The change in the HelmRelease is the removal of a multiline YAML configuration from the values section. flux diff shows no difference between the current configuration and the applied configuration.
Currently configured resource in the cluster:
Configuration in the GitRepo:
As you can see, the value customCardTemplate is no longer present; however, the kustomize controller does not identify any change here.