Closed abudavis closed 1 month ago
A quick test using the e2e-test script shows this working.
I am not really used to OpenShift. Is wave allowed to watch the ConfigMaps of the namespace "ace"?
@toelke Thanks for looking into this. I am not sure how to check that. Wave is in namespace "wave", the secret is in namespace "ace", and the deployment, which does not mount or use that secret, is in namespace "utilities". I haven't used the "--namespaces=" option to limit it, so I have kind of assumed the helm chart has the RBAC needed to make this work. Comments?
I have kind of assumed the helm chart has the RBAC needed to make this work
I assumed the same, but needed to lean on your OpenShift experience to confirm that that is supposed to work with OpenShift. I fear I need to break off my investigation soon; I will pick it up tomorrow.
My WIP in changing the tests is here: https://github.com/wave-k8s/wave/commit/748a7ac325ec8bab3ddfc9beddc5f91413e0b1a8
When you run it, it will stop at "Waiting for test to complete" until you change/create either the ConfigMap or Secret "test/test".
@toelke I checked the clusterrole and clusterrolebinding and deleted it to check whether wave complains. I got a ton of RBAC errors in the pod, which was good, so the RBAC is likely fine.
$ oc get clusterrole wave-wave
NAME CREATED AT
wave-wave 2024-10-10T11:35:13Z
$ oc get clusterrolebinding wave-wave -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    meta.helm.sh/release-name: wave
    meta.helm.sh/release-namespace: wave
  creationTimestamp: "2024-10-10T11:35:13Z"
  labels:
    app: wave
    app.kubernetes.io/managed-by: Helm
    heritage: Helm
    release: wave
  name: wave-wave
  resourceVersion: "194627029"
  uid: 4c730d61-8f45-4519-88c3-9e384cc85094
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: wave-wave
subjects:
- kind: ServiceAccount
  name: wave-wave
  namespace: wave
$ oc get sa wave-wave
NAME SECRETS AGE
wave-wave 1 4h44m
$ oc get sa wave-wave -o yaml
apiVersion: v1
imagePullSecrets:
- name: wave-wave-dockercfg-w74wx
kind: ServiceAccount
metadata:
  annotations:
    meta.helm.sh/release-name: wave
    meta.helm.sh/release-namespace: wave
  creationTimestamp: "2024-10-10T11:35:13Z"
  labels:
    app: wave
    app.kubernetes.io/managed-by: Helm
    heritage: Helm
    release: wave
  name: wave-wave
  namespace: wave
  resourceVersion: "194627052"
  uid: b10bf2a7-5dfa-4549-923b-6a3c9aa23c44
secrets:
- name: wave-wave-dockercfg-w74wx
Next, I deleted the annotations and hash and then added an annotation for secret to a non-existent secret in "ace" and then wave printed this in the log, but it didn't restart the pod which is good as probably it couldn't find the secret, but I'd say wave should have printed that in the log which it didn't.
2024-10-10T15:58:21Z INFO wave Updating instance hash {"namespace": "utilities", "name": "update-acevault", "dryRun": false, "isCreate": false, "hash": "100444e91862dd77d7ebe29f050c1e9a7f357c771e1a7b7650aae27e6a3a031d"}
2024-10-10T15:58:21Z DEBUG events Configuration hash updated to 100444e91862dd77d7ebe29f050c1e9a7f357c771e1a7b7650aae27e6a3a031d {"type": "Normal", "object": {"kind":"Deployment","namespace":"utilities","name":"update-acevault","uid":"a488f0db-3af1-4282-8cae-a045e394611a","apiVersion":"apps/v1","resourceVersion":"194879739"}, "reason": "ConfigChanged"}
Next, I used a secret that is in the same namespace "utilities" as the deployment; that did not make any difference whatsoever. I tried both of these combinations, which leads me to believe that "wave.pusher.com/extra-secrets" may not be working on OpenShift?
wave.pusher.com/extra-secrets: utilities/cpd-cli-apikey
wave.pusher.com/extra-secrets: cpd-cli-apikey
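The two annotation values above differ only in whether the namespace is spelled out. As a minimal sketch of how such a value could be split into namespace/name references, assuming entries are comma-separated and that an entry without a slash refers to the Deployment's own namespace (both are assumptions here, not confirmed in this thread), consider the following. parse_extra_refs is a hypothetical helper for illustration, not wave's code.

```python
def parse_extra_refs(value, default_namespace):
    """Split an annotation value into (namespace, name) pairs.

    Assumption: entries are comma-separated, and an entry without a "/"
    belongs to default_namespace (the Deployment's namespace).
    """
    refs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        namespace, sep, name = entry.partition("/")
        if not sep:
            # No slash: fall back to the Deployment's namespace.
            namespace, name = default_namespace, entry
        refs.append((namespace, name))
    return refs


print(parse_extra_refs("utilities/cpd-cli-apikey", "utilities"))
print(parse_extra_refs("cpd-cli-apikey", "utilities"))
```

Under these assumptions both forms resolve to the same reference, so they would be expected to behave identically for a Deployment in "utilities".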
I can't set the deployment to mount the secret, as that would allow anyone with access to the pod to read the mounted secret. The pod is supposed to read the secret, do some stuff real quick and delete the secret.
$ oc get clusterrolebinding wave-wave -o yaml
oc get clusterrole would also be interesting.
but it didn't restart the pod which is good as probably it couldn't find the secret, but I'd say wave should have printed that in the log which it didn't.
If you create the secret, wave will pick it up and restart the Pod.
I can't set the deployment to mount the secret, as that would allow anyone with access to the pod to read the mounted secret.
That is precisely what this feature is for.
Can you show the clusterrole of update-acevault-? That would help compare why it can read the secret but wave can't.
How many secrets, deployments and configmaps are in your cluster overall? Are you sure that setting SyncPeriod to such a low value is sensible? The default is 10 hours. Note: This is not about how fast wave will normally react to changes in secrets and configmaps.
@toelke "If you create the secret, wave will pick it up and restart the Pod."? >>> The secret is already created, maybe I misunderstood this, but my understanding is wave is capable of detecting an update or change of an existing secret and then changing, updating or inserting a hash on the deployment at /spec/template/metadata/annotations?
oc get clusterrole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    meta.helm.sh/release-name: wave
    meta.helm.sh/release-namespace: wave
  creationTimestamp: "2024-10-10T11:35:13Z"
  labels:
    app: wave
    app.kubernetes.io/managed-by: Helm
    heritage: Helm
    release: wave
  name: wave-wave
  resourceVersion: "195903655"
  uid: ccf2db3c-1dc7-4f58-960b-c409471157d5
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  verbs:
  - list
  - get
  - update
  - patch
  - watch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - update
  - patch
- apiGroups:
  - apps
  resources:
  - deployments
  - daemonsets
  - statefulsets
  verbs:
  - list
  - get
  - update
  - patch
  - watch
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - '*'
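Because the wave-wave ClusterRole is attached through a ClusterRoleBinding, its rules apply in every namespace, including "ace" and "utilities". As a rough cross-check, the minimal Python sketch below evaluates those rules for a given API group, resource and verb. The allowed helper is hypothetical and deliberately simplified (it ignores resourceNames, subresources and non-resource URLs). On a live cluster, oc auth can-i watch secrets -n ace --as=system:serviceaccount:wave:wave-wave asks the API server the same question.

```python
# Rules copied from the wave-wave ClusterRole shown above.
RULES = [
    {"apiGroups": [""], "resources": ["configmaps", "secrets"],
     "verbs": ["list", "get", "update", "patch", "watch"]},
    {"apiGroups": [""], "resources": ["events"],
     "verbs": ["create", "update", "patch"]},
    {"apiGroups": ["apps"],
     "resources": ["deployments", "daemonsets", "statefulsets"],
     "verbs": ["list", "get", "update", "patch", "watch"]},
    {"apiGroups": ["coordination.k8s.io"], "resources": ["leases"],
     "verbs": ["*"]},
]


def allowed(rules, api_group, resource, verb):
    """Simplified RBAC match: a request is allowed if any single rule
    covers its API group, resource and verb ("*" matches anything)."""
    def matches(values, wanted):
        return "*" in values or wanted in values

    return any(
        matches(rule["apiGroups"], api_group)
        and matches(rule["resources"], resource)
        and matches(rule["verbs"], verb)
        for rule in rules
    )


print(allowed(RULES, "", "secrets", "watch"))          # True
print(allowed(RULES, "apps", "deployments", "patch"))  # True
print(allowed(RULES, "", "secrets", "delete"))         # False
```

On paper, then, the role lets wave watch Secrets and patch Deployments, which matches the observation that deleting it produced RBAC errors.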
ACE vault's RBAC:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: acevault
  namespace: utilities
  annotations:
    argocd.argoproj.io/sync-wave: "0"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: edit-acevault3
  namespace: ace
  annotations:
    argocd.argoproj.io/sync-wave: "0"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
- kind: ServiceAccount
  name: acevault
  namespace: utilities
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  creationTimestamp: null
  name: edit-acevault4
  namespace: utilities
  annotations:
    argocd.argoproj.io/sync-wave: "0"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
- kind: ServiceAccount
  name: acevault
  namespace: utilities
I am not sure how many objects we have, but we have over 2000 pods running on the OpenShift cluster. SyncPeriod: I kind of assumed that leaving it at the default of 10 hours would mean it takes 10 hours for wave to detect that the secret has changed and trigger the deployment? What does this setting mean really?
The secret is already created, maybe I misunderstood this, but my understanding is wave is capable of detecting an update or change of an existing secret
It is also capable of detecting the creation of a new secret. I was referencing this test you did:
Next, I deleted the annotations and hash and then added an annotation for secret to a non-existent secret in "ace" and then wave printed this in the log, but it didn't restart the pod which is good as probably it couldn't find the secret,
[SyncPeriod]: What does this setting mean really?
https://github.com/kubernetes-sigs/controller-runtime/blob/v0.17.2/pkg/cache/cache.go#L146-L171 It is mainly to work around possible bugs where the watch-stream loses updates; in effect, this is a worst-case reaction time for wave.
@toelke Is it possible to increase the log levels to see what's happening in the background for the wave pod? I am out of options to figure out why this doesn't work on our cluster.
A number of interesting code-paths are not logging; I might work on that.
To make sure: All Secrets and ConfigMaps exist? All mounted ones and all referenced by annotation? If any (non-optional) is missing, wave will not do anything until the full set exists.
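The "full set" rule can be pictured with a small Python sketch: a hash is only produced once every referenced object has been found, and any change to the data changes the hash. This is an illustration under stated assumptions, not wave's actual algorithm. config_hash, its input shapes and the choice of SHA-256 over sorted keys are all hypothetical.

```python
import hashlib


def config_hash(required, found):
    """Return a hex digest over all referenced objects, or None if any is missing.

    required: list of (namespace, name) references
    found:    dict mapping (namespace, name) -> dict of key -> bytes
    """
    if any(ref not in found for ref in required):
        return None  # do nothing until the full set exists

    digest = hashlib.sha256()
    for namespace, name in sorted(required):
        digest.update(f"{namespace}/{name}\n".encode())
        data = found[(namespace, name)]
        for key in sorted(data):
            digest.update(key.encode() + b"=" + data[key] + b"\n")
    return digest.hexdigest()


refs = [("ace", "mqsicredentials")]
print(config_hash(refs, {}))  # None: the secret has not been found yet
old = config_hash(refs, {("ace", "mqsicredentials"): {"key": b"old"}})
new = config_hash(refs, {("ace", "mqsicredentials"): {"key": b"new"}})
print(old != new)  # True: changed data gives a different hash
```

In this picture, a missing non-optional Secret blocks the hash entirely, while an update to an existing Secret yields a new hash for the pod template annotation.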
@toelke In our implementation, it's just a secret update; no ConfigMap is involved. Oh wait, does the secret need to be annotated as well?
Can you try running the image quay.io/wave-k8s/wave:v0.9.0-extra-logging?
It would print something like
2024-10-11T09:11:26Z INFO wave All children found {"namespace": "default", "name": "test", "configMaps": "map[default/test:&ConfigMap{...} test/test:&ConfigMap{...}]", "secrets": "map[test/test:&Secret{...}]"}
Showing that wave found two configmaps (default/test and test/test) and one secret (test/test).
@toelke That image works! Now the deployment is patched when the secret is changed, cool! So it was the image I guess?
This image does not work as intended: quay.io/wave-k8s/wave:v0.8.0
2024-10-11T10:28:38Z INFO wave Updating instance hash {"namespace": "utilities", "name": "update-acevault", "dryRun": false, "isCreate": false, "hash": "78949eeabaeb36d55ee38257b48ffc695c7fb925ed7cd2989efd272868e5e574"}
2024-10-11T10:28:38Z DEBUG events Configuration hash updated to 78949eeabaeb36d55ee38257b48ffc695c7fb925ed7cd2989efd272868e5e574 {"type": "Normal", "object": {"kind":"Deployment","namespace":"utilities","name":"update-acevault","uid":"a488f0db-3af1-4282-8cae-a045e394611a","apiVersion":"apps/v1","resourceVersion":"196045962"}, "reason": "ConfigChanged"}
2024-10-11T10:28:38Z INFO wave All children found {"namespace": "utilities", "name": "update-acevault", "configMaps": "map[]", "secrets": "map[ace/mqsicredentials:&Secret{ObjectMeta:{mqsicredentials ace 15a44b96-624c-456b-9ee9-62463740d509 194972281 0 2024-04-15 20:19:26 +0000 UTC <nil> <nil> map[] map[] [] [] [{Mozilla Update v1 2024-10-10 17:07:28 +0000 UTC FieldsV1 {\"f:data\":{\".\":{},\"f:key\":{}},\"f:type\":{}} }]},Data:map[string][]byte{key: [109 113 115 105 99 114 101 100 101 110 116 105 97 108 115 32 45 45 119 111 114 107 45 100 105 114 32 47 104 111 109 101 47 97 99 101 117 115 101 114
I changed the image to "quay.io/wave-k8s/wave:v0.9.0" and it works perfectly without the extra logs! So I guess you might want to update your helm chart?
2024-10-11T10:40:10Z INFO wave Updating instance hash {"namespace": "utilities", "name": "update-acevault", "hash": "d693ab05fdfce3671763971a03921ea13164dca9bfe2f5d14f7fd41ba6f3b3e7"}
2024-10-11T10:40:10Z DEBUG events Configuration hash updated to d693ab05fdfce3671763971a03921ea13164dca9bfe2f5d14f7fd41ba6f3b3e7 {"type": "Normal", "object": {"kind":"Deployment","namespace":"utilities","name":"update-acevault","uid":"a488f0db-3af1-4282-8cae-a045e394611a","apiVersion":"apps/v1","resourceVersion":"196049884"}, "reason": "ConfigChanged"}
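A quick way to confirm from logs like these that a secret change was picked up is to compare the "hash" field of two "Updating instance hash" lines. The sketch below parses the JSON object at the end of each line. log_fields is a hypothetical helper, and it assumes the structured fields start at the first "{" on the line, as they do in the lines quoted in this thread.

```python
import json


def log_fields(line):
    """Return the structured fields (the trailing JSON object) of a wave log line."""
    return json.loads(line[line.index("{"):])


before = (
    '2024-10-11T10:28:38Z INFO wave Updating instance hash '
    '{"namespace": "utilities", "name": "update-acevault", "dryRun": false, '
    '"isCreate": false, '
    '"hash": "78949eeabaeb36d55ee38257b48ffc695c7fb925ed7cd2989efd272868e5e574"}'
)
after = (
    '2024-10-11T10:40:10Z INFO wave Updating instance hash '
    '{"namespace": "utilities", "name": "update-acevault", '
    '"hash": "d693ab05fdfce3671763971a03921ea13164dca9bfe2f5d14f7fd41ba6f3b3e7"}'
)

changed = log_fields(before)["hash"] != log_fields(after)["hash"]
print(changed)  # True
```

A changed hash means the pod template annotation was rewritten, and a change to the pod template is what makes the Deployment roll out new pods.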
Oh. I did not correctly release 0.9.0 to helm :facepalm:
Try chart version 4.3.0.
I am glad this is solved, which also means you can now be certain it works on OpenShift 4.x too :) Thank you for all the help!
Wave version: latest as per helm chart
OpenShift: v4.14.31
Install commands used:
We are trying to get wave to restart a pod from a Deployment. I set the following annotations and changed the secret "mqsicredentials" in the "ace" namespace, but nothing happened. Wave is deployed in namespace "wave". The Deployment is deployed in namespace "utilities" and does not mount the secret "mqsicredentials", as that's not needed and the secret is in a different namespace ("ace") anyway.
The "updating instance hash" in the log below came in when I set the above annotations on the deployment for the first time (at 11:42 UTC, as shown in the logs), at which point it did restart the pod. But when I later updated the secret and waited for much longer than 5 minutes, nothing happened. I have now made multiple attempts to update the secret and nothing! Please help.
Deployment's Pod where the hash annotation has been inserted by wave:
The wave pod logs are as follows. There is also a mutatingwebhookconfiguration for wave; I am unsure how to check whether that works.