wave-k8s / wave

Kubernetes configuration tracking controller
Apache License 2.0

RBAC issue on Openshift 4.14.31 #174

Closed abudavis closed 1 month ago

abudavis commented 1 month ago

Wave version: latest as per helm chart
OpenShift: v4.14.31

Install commands used:

helm repo add wave-k8s https://wave-k8s.github.io/wave/
helm install wave wave-k8s/wave --namespace wave --set syncPeriod=5m --set webhooks.enabled=true --create-namespace
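
For reference, the two `--set` flags above can also be kept in a values file. This is only a sketch: the file name `values.yaml` is an assumption, and the keys are taken directly from the flags in the install command.

```yaml
# values.yaml (hypothetical file name)
# Same settings as "--set syncPeriod=5m --set webhooks.enabled=true".
syncPeriod: 5m
webhooks:
  enabled: true
```

It would be passed with `helm install wave wave-k8s/wave --namespace wave --create-namespace -f values.yaml`.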

We have an install issue that looks like an RBAC problem. Please suggest a way to fix it. The following error appears in the Deployment YAML:

message: 'pods "wave-wave-d9df65dc5-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider "pipelines-scc": Forbidden: not usable by user or serviceaccount, provider "splunk-otel-agent-splunk-otel-collector": Forbidden: not usable by user or serviceaccount, provider restricted-v2: .containers[0].runAsUser: Invalid value: 1000: must be in the ranges: [1000760000, 1000769999], provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]'

Service account:

apiVersion: v1
imagePullSecrets:
- name: wave-wave-dockercfg-8sjcn
kind: ServiceAccount
metadata:
  annotations:
    meta.helm.sh/release-name: wave
    meta.helm.sh/release-namespace: wave
  creationTimestamp: "2024-10-10T10:38:10Z"
  labels:
    app: wave
    app.kubernetes.io/managed-by: Helm
    heritage: Helm
    release: wave
  name: wave-wave
  namespace: wave
  resourceVersion: "194567755"
  uid: 36f2461d-2be8-4ddb-ac28-b193e52b0ebd
secrets:
- name: wave-wave-dockercfg-8sjcn

Deployment yaml:

kind: Deployment
apiVersion: apps/v1
metadata:
  annotations:
    deployment.kubernetes.io/revision: '2'
    meta.helm.sh/release-name: wave
    meta.helm.sh/release-namespace: wave
  name: wave-wave
  namespace: wave
  labels:
    app: wave
    app.kubernetes.io/managed-by: Helm
    certmanager.k8s.io/time-restarted: 2024-10-10.103811
    heritage: Helm
    release: wave
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wave
      heritage: Helm
      release: wave
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: wave
        certmanager.k8s.io/time-restarted: 2024-10-10.103811
        heritage: Helm
        release: wave
    spec:
      restartPolicy: Always
      serviceAccountName: wave-wave
      schedulerName: default-scheduler
      affinity: {}
      terminationGracePeriodSeconds: 30
      securityContext:
        runAsUser: 1000
        runAsNonRoot: true
      containers:
        - resources: {}
          terminationMessagePath: /dev/termination-log
          name: wave-wave
          ports:
            - name: webhook-server
              containerPort: 9443
              protocol: TCP
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: cert
              readOnly: true
              mountPath: /tmp/k8s-webhook-server/serving-certs
          terminationMessagePolicy: File
          image: 'quay.io/wave-k8s/wave:v0.8.0'
          args:
            - '--sync-period=5m'
            - '--enable-webhooks=true'
      serviceAccount: wave-wave
      volumes:
        - name: cert
          secret:
            secretName: wave-wave-webhook-server-cert
            defaultMode: 420
      dnsPolicy: ClusterFirst
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
status:
  observedGeneration: 4
  unavailableReplicas: 1
  conditions:
    - type: Progressing
      status: 'True'
      lastUpdateTime: '2024-10-10T10:39:10Z'
      lastTransitionTime: '2024-10-10T10:38:11Z'
      reason: NewReplicaSetAvailable
      message: ReplicaSet "wave-wave-d9df65dc5" has successfully progressed.
    - type: Available
      status: 'False'
      lastUpdateTime: '2024-10-10T10:39:11Z'
      lastTransitionTime: '2024-10-10T10:39:11Z'
      reason: MinimumReplicasUnavailable
      message: Deployment does not have minimum availability.
    - type: ReplicaFailure
      status: 'True'
      lastUpdateTime: '2024-10-10T10:39:11Z'
      lastTransitionTime: '2024-10-10T10:39:11Z'
      reason: FailedCreate
      message: 'pods "wave-wave-d9df65dc5-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider "pipelines-scc": Forbidden: not usable by user or serviceaccount, provider "splunk-otel-agent-splunk-otel-collector": Forbidden: not usable by user or serviceaccount, provider restricted-v2: .containers[0].runAsUser: Invalid value: 1000: must be in the ranges: [1000760000, 1000769999], provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]'

toelke commented 1 month ago

Hey!

You could try setting spec.template.spec.securityContext.runAsUser to a value allowed by your SCC. The error says: Invalid value: 1000: must be in the ranges: [1000760000, 1000769999]. With helm, you can do that by setting securityContext.runAsUser.
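
As a rough sketch of that suggestion, a values file could look like the one below. The key `securityContext.runAsUser` is the one named above. The value is simply the first UID in the range quoted in the error message, so it only applies to this particular namespace. The file name is an assumption.

```yaml
# values-openshift.yaml (hypothetical file name)
# Replaces the default runAsUser of 1000 that restricted-v2 rejected.
# 1000760000 is the lower bound of the range [1000760000, 1000769999]
# reported in the error; other namespaces will have a different range.
securityContext:
  runAsUser: 1000760000
```

The same setting on the command line would be `--set securityContext.runAsUser=1000760000`. On OpenShift the allowed range normally comes from the namespace annotation `openshift.io/sa.scc.uid-range`, so it is worth checking that annotation instead of copying the number from another cluster.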

abudavis commented 1 month ago

@toelke That fixed the problem. Btw, do you know how to set a CPU and Memory limit for the wave pod via helm commands?

toelke commented 1 month ago

You can set resources: https://github.com/wave-k8s/wave/blob/master/charts/wave/values.yaml#L56
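
A minimal sketch of what that could look like, assuming the chart copies `resources` unchanged into the container spec (the Deployment above shows `resources: {}`, which suggests an empty default). The CPU and memory numbers are placeholders, not recommendations, and the file name is an assumption.

```yaml
# values-resources.yaml (hypothetical file name)
# Standard Kubernetes requests/limits layout; all values are placeholders.
resources:
  requests:
    cpu: 50m
    memory: 64Mi
  limits:
    cpu: 200m
    memory: 128Mi
```

With helm flags, the limits alone would be `--set resources.limits.cpu=200m --set resources.limits.memory=128Mi`.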

abudavis commented 1 month ago

@toelke: We are trying to get wave to restart a pod from a Deployment. I set the following annotations and changed the secret "mqsicredentials" in the "ace" namespace, but nothing happened. Wave is deployed in namespace "wave" and the Deployment in namespace "utilities". The Deployment does not mount the secret "mqsicredentials", because that is not needed and the secret lives in a different namespace ("ace") anyway.

How can I get this to work? Am I doing something fundamentally wrong?

kind: Deployment
apiVersion: apps/v1
metadata:
  annotations:
    wave.pusher.com/extra-secrets: ace/mqsicredentials
    wave.pusher.com/update-on-config-change: 'true'
  name: update-acevault
...

The wave pod logs are as follows. There is also a MutatingWebhookConfiguration for wave; I am unsure how to check whether it works.

2024-10-10T11:35:20Z    INFO    setup   setting up client for manager
2024-10-10T11:35:20Z    INFO    setup   setting up manager
2024-10-10T11:35:20Z    INFO    setup   Registering Components.
2024-10-10T11:35:20Z    INFO    setup   setting up scheme
2024-10-10T11:35:20Z    INFO    setup   Setting up controller
2024-10-10T11:35:20Z    INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "apps/v1, Kind=Deployment", "path": "/mutate-apps-v1-deployment"}
2024-10-10T11:35:20Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-apps-v1-deployment"}
2024-10-10T11:35:20Z    INFO    controller-runtime.builder  skip registering a validating webhook, object does not implement admission.Validator or WithValidator wasn't called {"GVK": "apps/v1, Kind=Deployment"}
2024-10-10T11:35:20Z    INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "apps/v1, Kind=StatefulSet", "path": "/mutate-apps-v1-statefulset"}
2024-10-10T11:35:20Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-apps-v1-statefulset"}
2024-10-10T11:35:20Z    INFO    controller-runtime.builder  skip registering a validating webhook, object does not implement admission.Validator or WithValidator wasn't called {"GVK": "apps/v1, Kind=StatefulSet"}
2024-10-10T11:35:20Z    INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "apps/v1, Kind=DaemonSet", "path": "/mutate-apps-v1-daemonset"}
2024-10-10T11:35:20Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-apps-v1-daemonset"}
2024-10-10T11:35:20Z    INFO    controller-runtime.builder  skip registering a validating webhook, object does not implement admission.Validator or WithValidator wasn't called {"GVK": "apps/v1, Kind=DaemonSet"}
2024-10-10T11:35:20Z    INFO    setup   Starting the Cmd.
2024-10-10T11:35:20Z    INFO    controller-runtime.metrics  Starting metrics server
2024-10-10T11:35:20Z    INFO    controller-runtime.webhook  Starting webhook server
2024-10-10T11:35:20Z    INFO    Starting EventSource    {"controller": "statefulset-controller", "source": "kind source: *v1.StatefulSet"}
2024-10-10T11:35:20Z    INFO    Starting EventSource    {"controller": "deployment-controller", "source": "kind source: *v1.Deployment"}
2024-10-10T11:35:20Z    INFO    controller-runtime.metrics  Serving metrics server  {"bindAddress": ":8080", "secure": false}
2024-10-10T11:35:20Z    INFO    Starting EventSource    {"controller": "deployment-controller", "source": "kind source: *v1.ConfigMap"}
2024-10-10T11:35:20Z    INFO    Starting EventSource    {"controller": "statefulset-controller", "source": "kind source: *v1.ConfigMap"}
2024-10-10T11:35:20Z    INFO    Starting EventSource    {"controller": "deployment-controller", "source": "kind source: *v1.Secret"}
2024-10-10T11:35:20Z    INFO    Starting Controller {"controller": "deployment-controller"}
2024-10-10T11:35:20Z    INFO    Starting EventSource    {"controller": "statefulset-controller", "source": "kind source: *v1.Secret"}
2024-10-10T11:35:20Z    INFO    Starting Controller {"controller": "statefulset-controller"}
2024-10-10T11:35:20Z    INFO    Starting EventSource    {"controller": "daemonset-controller", "source": "kind source: *v1.DaemonSet"}
2024-10-10T11:35:20Z    INFO    Starting EventSource    {"controller": "daemonset-controller", "source": "kind source: *v1.ConfigMap"}
2024-10-10T11:35:20Z    INFO    Starting EventSource    {"controller": "daemonset-controller", "source": "kind source: *v1.Secret"}
2024-10-10T11:35:20Z    INFO    Starting Controller {"controller": "daemonset-controller"}
2024-10-10T11:35:20Z    INFO    controller-runtime.certwatcher  Updated current TLS certificate
2024-10-10T11:35:20Z    INFO    controller-runtime.webhook  Serving webhook server  {"host": "", "port": 9443}
2024-10-10T11:35:20Z    INFO    controller-runtime.certwatcher  Starting certificate watcher
2024-10-10T11:35:20Z    INFO    Starting workers    {"controller": "statefulset-controller", "worker count": 1}
2024-10-10T11:35:20Z    INFO    Starting workers    {"controller": "deployment-controller", "worker count": 1}
2024-10-10T11:35:20Z    INFO    Starting workers    {"controller": "daemonset-controller", "worker count": 1}
2024/10/10 11:38:28 http: TLS handshake error from 10.128.4.2:46762: EOF
2024-10-10T11:42:21Z    INFO    wave    Updating instance hash  {"namespace": "utilities", "name": "update-acevault", "dryRun": false, "isCreate": false, "hash": "100444e91862dd77d7ebe29f050c1e9a7f357c771e1a7b7650aae27e6a3a031d"}
2024-10-10T11:42:21Z    DEBUG   events  Configuration hash updated to 100444e91862dd77d7ebe29f050c1e9a7f357c771e1a7b7650aae27e6a3a031d  {"type": "Normal", "object": {"kind":"Deployment","namespace":"utilities","name":"update-acevault","uid":"a488f0db-3af1-4282-8cae-a045e394611a","apiVersion":"apps/v1","resourceVersion":"188476560"}, "reason": "ConfigChanged"}
2024/10/10 11:53:28 http: TLS handshake error from 10.128.4.2:36522: EOF

toelke commented 1 month ago

This looks like it is working. When in the log did you change the secret?

abudavis commented 1 month ago

@toelke It's not working. The "Updating instance hash" entry in the log appeared when I first set the annotation on the Deployment (at 11:42 UTC in the logs), at which point it did restart the pod. But after I updated the secret and waited much longer than 5 minutes, nothing happened.