kubewarden / policy-server

Webhook server that evaluates WebAssembly policies to validate Kubernetes requests
https://kubewarden.io
Apache License 2.0

Install process via Rancher Fleet #394

Closed eumel8 closed 1 year ago

eumel8 commented 1 year ago

Is there an existing issue for this?

Current Behavior

On a Kubernetes 1.20.15 cluster I am trying to install Kubewarden in a namespace with a restricted PSP. I used the Helm charts kubewarden-controller 1.2.8, kubewarden-crds 1.2.3, and kubewarden-defaults 1.2.8, with the images kubewarden-controller v1.4.0 and kubewarden-policy-server v1.4.0, plus cert-manager 1.10.1. The controller seems up & running, but the policy server dies:

1.6724053021486528e+09  INFO    controller-runtime.metrics  Metrics server is starting to listen    {"addr": ":8088"}
1.6724053021492136e+09  INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "policies.kubewarden.io/v1, Kind=PolicyServer", "path": "/mutate-policies-kubewarden-io-v1-policyserver"}
1.6724053021493063e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-policies-kubewarden-io-v1-policyserver"}
1.6724053021493528e+09  INFO    controller-runtime.builder  skip registering a validating webhook, object does not implement admission.Validator or WithValidator wasn't called {"GVK": "policies.kubewarden.io/v1, Kind=PolicyServer"}
1.6724053021493993e+09  INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "policies.kubewarden.io/v1, Kind=ClusterAdmissionPolicy", "path": "/mutate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.6724053021494603e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.6724053021495e+09 INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "policies.kubewarden.io/v1, Kind=ClusterAdmissionPolicy", "path": "/validate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.6724053021495345e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.672405302149621e+09   INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "policies.kubewarden.io/v1, Kind=AdmissionPolicy", "path": "/mutate-policies-kubewarden-io-v1-admissionpolicy"}
1.6724053021496496e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-policies-kubewarden-io-v1-admissionpolicy"}
1.6724053021496844e+09  INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "policies.kubewarden.io/v1, Kind=AdmissionPolicy", "path": "/validate-policies-kubewarden-io-v1-admissionpolicy"}
1.672405302149716e+09   INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-policies-kubewarden-io-v1-admissionpolicy"}
1.672405302150011e+09   INFO    setup   starting manager
1.67240530215053e+09    INFO    controller-runtime.webhook.webhooks Starting webhook server
1.672405302150613e+09   INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8088"}
1.6724053021507907e+09  INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
1.6724053022523959e+09  INFO    Stopping and waiting for non leader election runnables
1.6724053022525866e+09  INFO    Stopping and waiting for leader election runnables
1.67240530225253e+09    INFO    Starting EventSource    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "source": "kind source: *v1.PolicyServer"}
1.672405302252617e+09   INFO    Starting EventSource    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "source": "kind source: *v1.AdmissionPolicy"}
1.6724053022526267e+09  INFO    Starting EventSource    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "source": "kind source: *v1.ClusterAdmissionPolicy"}
1.6724053022526097e+09  INFO    Starting EventSource    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "source": "kind source: *v1.AdmissionPolicy"}
1.6724053022526398e+09  INFO    Starting EventSource    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "source": "kind source: *v1.Pod"}
1.6724053022526493e+09  INFO    Starting EventSource    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "source": "kind source: *v1.PolicyServer"}
1.6724053022526536e+09  INFO    Starting Controller {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy"}
1.6724053022526634e+09  INFO    Starting workers    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "worker count": 1}
1.6724053022526767e+09  INFO    Shutdown signal received, waiting for all workers to finish {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy"}
1.6724053022526317e+09  INFO    Starting Controller {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer"}
1.6724053022526891e+09  INFO    Starting workers    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "worker count": 1}
1.672405302252693e+09   INFO    Shutdown signal received, waiting for all workers to finish {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer"}
1.6724053022526965e+09  INFO    All workers finished    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer"}
1.6724053022527058e+09  INFO    All workers finished    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy"}
1.6724053022527063e+09  INFO    Starting EventSource    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "source": "kind source: *v1.ClusterAdmissionPolicy"}
1.672405302252735e+09   INFO    Starting EventSource    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "source": "kind source: *v1.Pod"}
1.6724053022527413e+09  INFO    Starting EventSource    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "source": "kind source: *v1.PolicyServer"}
1.6724053022527454e+09  INFO    Starting Controller {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy"}
1.6724053022527523e+09  INFO    Starting workers    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "worker count": 1}
1.6724053022527595e+09  INFO    Shutdown signal received, waiting for all workers to finish {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy"}
1.6724053022527869e+09  INFO    All workers finished    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy"}
1.6724053022527933e+09  INFO    Stopping and waiting for caches
1.6724053022529182e+09  INFO    Stopping and waiting for webhooks
1.6724053022529316e+09  INFO    Wait completed, proceeding to shutdown the manager
1.6724053022529082e+09  ERROR   controller-runtime.source   failed to get informer from cache   {"error": "Timeout: failed waiting for *v1.PolicyServer Informer to sync"}
sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1.1
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/source/source.go:144
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext
    /go/pkg/mod/k8s.io/apimachinery@v0.25.3/pkg/util/wait/wait.go:235
k8s.io/apimachinery/pkg/util/wait.poll
    /go/pkg/mod/k8s.io/apimachinery@v0.25.3/pkg/util/wait/wait.go:582
k8s.io/apimachinery/pkg/util/wait.PollImmediateUntilWithContext
    /go/pkg/mod/k8s.io/apimachinery@v0.25.3/pkg/util/wait/wait.go:547
sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/source/source.go:132
1.6724053022529385e+09  ERROR   setup   problem running manager {"error": "open /tmp/k8s-webhook-server/serving-certs/tls.crt: no such file or directory"}
main.main
    /workspace/main.go:205
runtime.main
    /usr/local/go/src/runtime/proc.go:250

describe pod:

kubectl -n kubewarden describe pod policy-server-default-5765c5dbd-vvbk8 
Name:         policy-server-default-5765c5dbd-vvbk8
Namespace:    kubewarden
Priority:     0
Node:         vm-frank-test-k8s-02-ranchernode-2/10.9.3.164
Start Time:   Fri, 30 Dec 2022 13:45:17 +0100
Labels:       app=kubewarden-policy-server-default
              kubewarden/config-version=24636276
              kubewarden/policy-server=default
              pod-template-hash=5765c5dbd
Annotations:  cni.projectcalico.org/podIP: 10.42.6.10/32
              cni.projectcalico.org/podIPs: 10.42.6.10/32
              kubernetes.io/psp: restricted
              seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status:       Running
IP:           10.42.6.10
IPs:
  IP:           10.42.6.10
Controlled By:  ReplicaSet/policy-server-default-5765c5dbd
Containers:
  policy-server-default:
    Container ID:   docker://0b3fcc62396f7058fd8fefb7d37370a5fcac94aa4291ce295e79d4878d2578a4
    Image:          mtr.devops.telekom.de/kubewarden/kubewarden-policy-server:v1.4.0
    Image ID:       docker-pullable://mtr.devops.telekom.de/kubewarden/kubewarden-policy-server@sha256:249f2f0554e5213f4cae2bf43e44bf3332f8d287fe2831430dd638325f0dbfb2
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 30 Dec 2022 14:06:51 +0100
      Finished:     Fri, 30 Dec 2022 14:06:52 +0100
    Ready:          False
    Restart Count:  9
    Readiness:      http-get https://:8443/readiness delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      KUBEWARDEN_CERT_FILE:                                     /pki/policy-server-cert
      KUBEWARDEN_KEY_FILE:                                      /pki/policy-server-key
      KUBEWARDEN_PORT:                                          8443
      KUBEWARDEN_POLICIES_DOWNLOAD_DIR:                         /tmp
      KUBEWARDEN_POLICIES:                                      /config/policies.yml
      KUBEWARDEN_SIGSTORE_CACHE_DIR:                            /tmp/sigstore-data
      KUBEWARDEN_LOG_LEVEL:                                     info
      KUBEWARDEN_ALWAYS_ACCEPT_ADMISSION_REVIEWS_ON_NAMESPACE:  kubewarden
    Mounts:
      /config from policies (ro)
      /pki from certs (ro)
      /tmp from policy-store (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from policy-server-token-xxcgd (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  policy-store:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  policy-server-default
    Optional:    false
  policies:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      policy-server-default
    Optional:  false
  policy-server-token-xxcgd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  policy-server-token-xxcgd
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  25m                  default-scheduler  Successfully assigned kubewarden/policy-server-default-5765c5dbd-vvbk8 to vm-frank-test-k8s-02-ranchernode-2
  Normal   Pulled     25m                  kubelet            Successfully pulled image "mtr.devops.telekom.de/kubewarden/kubewarden-policy-server:v1.4.0" in 2.157525883s
  Normal   Pulled     25m                  kubelet            Successfully pulled image "mtr.devops.telekom.de/kubewarden/kubewarden-policy-server:v1.4.0" in 1.779984459s
  Normal   Pulled     25m                  kubelet            Successfully pulled image "mtr.devops.telekom.de/kubewarden/kubewarden-policy-server:v1.4.0" in 196.882324ms
  Normal   Started    24m (x4 over 25m)    kubelet            Started container policy-server-default
  Normal   Pulled     24m                  kubelet            Successfully pulled image "mtr.devops.telekom.de/kubewarden/kubewarden-policy-server:v1.4.0" in 234.828236ms
  Normal   Pulling    24m (x5 over 25m)    kubelet            Pulling image "mtr.devops.telekom.de/kubewarden/kubewarden-policy-server:v1.4.0"
  Normal   Created    23m (x5 over 25m)    kubelet            Created container policy-server-default
  Normal   Pulled     23m                  kubelet            Successfully pulled image "mtr.devops.telekom.de/kubewarden/kubewarden-policy-server:v1.4.0" in 245.639765ms
  Warning  BackOff    35s (x114 over 25m)  kubelet            Back-off restarting failed container

Just wondering: the cert is referenced in the controller, so I manually added it to the policy server deployment:

...
        volumeMounts:
        - mountPath: /tmp/k8s-webhook-server/serving-certs
          name: cert
          readOnly: true
...
      volumes:
      - name: cert
        secret:
          defaultMode: 420
          secretName: webhook-server-cert

This fixed the missing cert file issue, but other errors appeared after the restart:

W1230 13:17:24.874587       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.3/tools/cache/reflector.go:169: failed to list *v1.PolicyServer: policyservers.policies.kubewarden.io is forbidden: User "system:serviceaccount:kubewarden:policy-server" cannot list resource "policyservers" in API group "policies.kubewarden.io" at the cluster scope
E1230 13:17:24.874622       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.3/tools/cache/reflector.go:169: Failed to watch *v1.PolicyServer: failed to list *v1.PolicyServer: policyservers.policies.kubewarden.io is forbidden: User "system:serviceaccount:kubewarden:policy-server" cannot list resource "policyservers" in API group "policies.kubewarden.io" at the cluster scope
W1230 13:17:37.800549       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.3/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:kubewarden:policy-server" cannot list resource "pods" in API group "" at the cluster scope
E1230 13:17:37.800584       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.3/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:kubewarden:policy-server" cannot list resource "pods" in API group "" at the cluster scope
W1230 13:18:09.768600       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.3/tools/cache/reflector.go:169: failed to list *v1.PolicyServer: policyservers.policies.kubewarden.io is forbidden: User "system:serviceaccount:kubewarden:policy-server" cannot list resource "policyservers" in API group "policies.kubewarden.io" at the cluster scope
E1230 13:18:09.768649       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.3/tools/cache/reflector.go:169: Failed to watch *v1.PolicyServer: failed to list *v1.PolicyServer: policyservers.policies.kubewarden.io is forbidden: User "system:serviceaccount:kubewarden:policy-server" cannot list resource "policyservers" in API group "policies.kubewarden.io" at the cluster scope
W1230 13:18:09.798225       1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.3/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:kubewarden:policy-server" cannot list resource "pods" in API group "" at the cluster scope
E1230 13:18:09.798258       1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.3/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:kubewarden:policy-server" cannot list resource "pods" in API group "" at the cluster scope

I enhanced the permissions in the chart values:

  policyServer:
    permissions:
      - apiGroup: ""
        resources:
          - namespaces
          - pods
          - secrets
          - services
      - apiGroup: "networking.k8s.io"
        resources:
          - ingresses
      - apiGroup: "policies.kubewarden.io"
        resources:
          - clusteradmissionpolicies
          - admissionpolicies
          - policyservers
        verbs:
          - get
          - list
          - watch
      - apiGroup: "policies.kubewarden.io"
        resources:
          - policyservers/status
        verbs:
          - update

Now the RBAC errors are gone. ClusterRole/kubewarden-context-watcher is not really just a watcher anymore with these wide permissions, but okay, we want to get this thing running.

Finally:

1.6724065504180388e+09  ERROR   Reconciler error    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "PolicyServer": {"name":"default"}, "namespace": "", "name": "default", "reconcileID": "9caf4d17-be36-4773-954a-898421e83655", "error": "reconciliation error: error reconciling policy-server CA Secret: an empty namespace may not be set during creation"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:326
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:234

No idea what this means. Is a namespace missing in the policy server?

Expected Behavior

Policy Server up & running.

Steps To Reproduce

No response

Environment

- OS:
- Architecture:

Anything else?

No response

viccuad commented 1 year ago

Hi! thanks for opening an issue.

I tried to reproduce here with those exact versions, and I couldn't; everything works fine:

$ minikube start --kubernetes-version=v1.20.15

$ helm upgrade -i --wait \
    --namespace cert-manager \
    --create-namespace \
    --set installCRDs=true \
    cert-manager jetstack/cert-manager

$ helm upgrade -i --wait \
  --namespace kubewarden \
  --create-namespace \
  kubewarden-crds kubewarden/kubewarden-crds

$ helm upgrade -i --wait \
    --namespace kubewarden \
    --create-namespace \
    kubewarden-controller kubewarden/kubewarden-controller

$ helm upgrade -i --wait \
    --namespace kubewarden \
    --create-namespace \
    kubewarden-defaults kubewarden/kubewarden-defaults \
    --set recommendedPolicies.enabled=True \
    --set recommendedPolicies.defaultPolicyMode=monitor

$ helm ls -A
NAME                    NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                           APP VERSION
cert-manager            cert-manager    1               2023-01-03 11:28:35.792683104 +0100 CET deployed        cert-manager-v1.10.1            v1.10.1
kubewarden-controller   kubewarden      1               2023-01-03 11:31:33.85552669 +0100 CET  deployed        kubewarden-controller-1.2.8     v1.4.0
kubewarden-crds         kubewarden      1               2023-01-03 11:31:11.91692011 +0100 CET  deployed        kubewarden-crds-1.2.3
kubewarden-defaults     kubewarden      1               2023-01-03 11:32:02.98314005 +0100 CET  deployed        kubewarden-defaults-1.2.8

$ kubectl get clusteradmissionpolicies
NAME                        POLICY SERVER   MUTATING   MODE      OBSERVED MODE   STATUS
do-not-run-as-root          default         true       monitor   monitor         active
do-not-share-host-paths     default         false      monitor   monitor         active
drop-capabilities           default         true       monitor   monitor         active
no-host-namespace-sharing   default         false      monitor   monitor         active
no-privilege-escalation     default         true       monitor   monitor         active
no-privileged-pod           default         false      monitor   monitor         active

Was cert-manager up and running before installing kubewarden-controller? Did it have any hiccups?

Also, the "error reconciling policy-server CA Secret: an empty namespace may not be set during creation" message is indeed weird. From looking at the code, it seems kubewarden-controller couldn't create the Secret in its own namespace as it should, so again, lacking permissions. Has any other configuration been done to the cluster, or any specific RBAC?

eumel8 commented 1 year ago

Hello @viccuad,

thanks for picking up this issue. Yes, I started previously on a K3s cluster and everything worked fine. Then I moved on to our staging environment, managed by Rancher. This cluster is an RKE setup with PodSecurityPolicy and ResourceQuotas, but that shouldn't be a problem in general. cert-manager was running fine so far.

viccuad commented 1 year ago

@eumel8 I deployed again on RKE1 with Vagrant, using the setup and cluster config here: https://github.com/viccuad/local-kubernetes-setup-with-rke-and-vagrant

I still can't reproduce :/. Do you have any specific PodSecurityPolicy deployed? What version of RKE are you using? I'm happy to help, but I'm out of ideas without more info.

kravciak commented 1 year ago

Hi @eumel8, could you please describe in detail how you built this environment (including versions)? I tried to reproduce, but everything worked fine:

At this point kubewarden won't create controller pod: Error creating: pods "kubewarden-controller-6cf5b4f56c-" is forbidden: PodSecurityPolicy: unable to admit pod

I think this is expected. Could you please provide step-by-step instructions on how to reproduce your problem, including the versions used?

eumel8 commented 1 year ago

Okay, I tested the installation now on 3 clusters with the same result:

cluster 1: Kubernetes v1.20.15
cluster 2: Kubernetes v1.22.15
cluster 3: Kubernetes v1.23.12

Just to ensure it's not the Kubernetes version.

Used PSP:

$ kubectl get psp restricted -o yaml
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"policy/v1beta1","kind":"PodSecurityPolicy","metadata":{"annotations":{},"name":"restricted"},"spec":{"allowPrivilegeEscalation":false,"defaultAllowPrivilegeEscalation":false,"fsGroup":{"rule":"RunAsAny"},"privileged":false,"requiredDropCapabilities":["NET_RAW"],"runAsUser":{"rule":"MustRunAsNonRoot"},"seLinux":{"rule":"RunAsAny"},"supplementalGroups":{"rule":"RunAsAny"},"volumes":["emptyDir","secret","persistentVolumeClaim","downwardAPI","configMap","projected"]}}
  creationTimestamp: "2022-11-23T09:50:03Z"
  name: restricted
  resourceVersion: "2235677"
  uid: 994d046e-1530-4e4e-a76c-92ff22d7ef4f
spec:
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  fsGroup:
    rule: RunAsAny
  requiredDropCapabilities:
  - NET_RAW
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - emptyDir
  - secret
  - persistentVolumeClaim
  - downwardAPI
  - configMap
  - projected

Installation: charts and images are mirrored to our own infrastructure. The Rancher project kubewarden and the namespace kubewarden were prepared in the Rancher UI, with the PSP assigned.

Installing CRD chart:

$ helm -n kubewarden upgrade -i kubewarden-crds kubewarden-crds --version 1.2.3 --repo https://mcsps-charts.obs-website.eu-de.otc.t-systems.com/charts
Release "kubewarden-crds" does not exist. Installing it now.
NAME: kubewarden-crds
LAST DEPLOYED: Thu Jan  5 11:00:00 2023
NAMESPACE: kubewarden
STATUS: deployed
REVISION: 1
TEST SUITE: None

Controller values:

common:
  cattle:
    systemDefaultRegistry: mtr.devops.telekom.de
  policyServer:
    default:
      name: default
      enabled: true
image:
  repository: "kubewarden/kubewarden-controller"
  tag: "v1.4.0"

preDeleteJob:
  image:
    repository: "mcsps/kubectl"
    tag: "latest"

tls:
  source: cert-manager-self-signed
  certManagerIssuerName: ""

resources:
  controller:
    limits:
      cpu: 750m
      memory: 500Mi
    requests:
      cpu: 250m
      memory: 50Mi

Install Controller:

$ helm -n kubewarden upgrade -i kubewarden-controller kubewarden-controller -f values-controller.yaml --version 1.2.8 --repo https://mcsps-charts.obs-website.eu-de.otc.t-systems.com/charts 
Release "kubewarden-controller" does not exist. Installing it now.
NAME: kubewarden-controller
LAST DEPLOYED: Thu Jan  5 11:01:25 2023
NAMESPACE: kubewarden
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
kubewarden-controller installed.

You can start defining admission policies by using the cluster-wide
`clusteradmissionpolicies.policies.kubewarden.io` or the namespaced
`admissionpolicies.policies.kubewarden.io` resources.

For more information check out https://kubewarden.io/

Policy Server values:

common:
  cattle:
    systemDefaultRegistry: mtr.devops.telekom.de
policyServer:
  replicaCount: 1
  image:
    repository: "kubewarden/kubewarden-policy-server"
    tag: "v1.4.0"
  serviceAccountName: policy-server
  permissions:
    - apiGroup: ""
      resources:
        - namespaces
        - pods
        - secrets
        - services
    - apiGroup: "networking.k8s.io"
      resources:
        - ingresses
    - apiGroup: "policies.kubewarden.io"
      resources:
        - clusteradmissionpolicies
        - admissionpolicies
        - policyservers
      verbs:
        - get
        - list
        - watch
    - apiGroup: "policies.kubewarden.io"
      resources:
        - policyservers/status
      verbs:
        - update
  env:
    - name: KUBEWARDEN_LOG_LEVEL
      value: info
crdVersion: "policies.kubewarden.io/v1"
recommendedPolicies:
  enabled: False

Install Policy Server:

$ helm -n kubewarden upgrade -i kubewarden-defaults kubewarden-defaults -f values-defaults.yaml --version 1.2.8 --repo https://mcsps-charts.obs-website.eu-de.otc.t-systems.com/charts
Release "kubewarden-defaults" does not exist. Installing it now.
NAME: kubewarden-defaults
LAST DEPLOYED: Thu Jan  5 11:02:53 2023
NAMESPACE: kubewarden
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
kubewarden-defaults installed.

You now have a PolicyServer running in your cluster ready to run any
`clusteradmissionpolicies.policies.kubewarden.io` or
`admissionpolicies.policies.kubewarden.io` resources.

For more information on how to define policies, check out https://kubewarden.io/

Verify:

$ helm -n kubewarden list
NAME                    NAMESPACE   REVISION    UPDATED                                 STATUS      CHART                       APP VERSION
kubewarden-controller   kubewarden  1           2023-01-05 11:01:25.151029 +0100 CET    deployed    kubewarden-controller-1.2.8 v1.4.0     
kubewarden-crds         kubewarden  1           2023-01-05 11:00:00.210713 +0100 CET    deployed    kubewarden-crds-1.2.3                  
kubewarden-defaults     kubewarden  1           2023-01-05 11:02:53.554414 +0100 CET    deployed    kubewarden-defaults-1.2.8              
$ kubectl -n kubewarden get certificate
NAME                                 READY   SECRET                AGE
kubewarden-controller-serving-cert   True    webhook-server-cert   2m24s
$ kubectl -n kubewarden get pods       
NAME                                     READY   STATUS             RESTARTS   AGE
kubewarden-controller-78554b75cd-gj8l9   1/1     Running            0          2m45s
policy-server-default-5cb846897f-jk5vk   0/1     CrashLoopBackOff   3          77s
$ kubectl -n kubewarden logs policy-server-default-5cb846897f-jk5vk 
1.6729130255168474e+09  INFO    controller-runtime.metrics  Metrics server is starting to listen    {"addr": ":8088"}
1.6729130255171566e+09  INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "policies.kubewarden.io/v1, Kind=PolicyServer", "path": "/mutate-policies-kubewarden-io-v1-policyserver"}
1.672913025517262e+09   INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-policies-kubewarden-io-v1-policyserver"}
1.672913025517328e+09   INFO    controller-runtime.builder  skip registering a validating webhook, object does not implement admission.Validator or WithValidator wasn't called {"GVK": "policies.kubewarden.io/v1, Kind=PolicyServer"}
1.672913025517386e+09   INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "policies.kubewarden.io/v1, Kind=ClusterAdmissionPolicy", "path": "/mutate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.6729130255174284e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.6729130255174627e+09  INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "policies.kubewarden.io/v1, Kind=ClusterAdmissionPolicy", "path": "/validate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.672913025517506e+09   INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.672913025517611e+09   INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "policies.kubewarden.io/v1, Kind=AdmissionPolicy", "path": "/mutate-policies-kubewarden-io-v1-admissionpolicy"}
1.6729130255176942e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-policies-kubewarden-io-v1-admissionpolicy"}
1.672913025517791e+09   INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "policies.kubewarden.io/v1, Kind=AdmissionPolicy", "path": "/validate-policies-kubewarden-io-v1-admissionpolicy"}
1.6729130255178843e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-policies-kubewarden-io-v1-admissionpolicy"}
1.6729130255182335e+09  INFO    setup   starting manager
1.672913025518507e+09   INFO    controller-runtime.webhook.webhooks Starting webhook server
1.6729130255185835e+09  INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8088"}
1.6729130255185947e+09  INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
1.6729130256189635e+09  INFO    Stopping and waiting for non leader election runnables
1.6729130256190019e+09  INFO    Stopping and waiting for leader election runnables
1.672913025619062e+09   INFO    Starting EventSource    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "source": "kind source: *v1.PolicyServer"}
1.6729130256190772e+09  INFO    Starting EventSource    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "source": "kind source: *v1.ClusterAdmissionPolicy"}
1.6729130256190984e+09  INFO    Starting EventSource    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "source": "kind source: *v1.AdmissionPolicy"}
1.6729130256191058e+09  INFO    Starting EventSource    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "source": "kind source: *v1.ClusterAdmissionPolicy"}
1.6729130256191096e+09  INFO    Starting Controller {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer"}
1.6729130256191173e+09  INFO    Starting workers    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "worker count": 1}
1.6729130256191237e+09  INFO    Shutdown signal received, waiting for all workers to finish {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer"}
1.6729130256191006e+09  INFO    Starting EventSource    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "source": "kind source: *v1.Pod"}
1.6729130256191328e+09  INFO    Starting EventSource    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "source": "kind source: *v1.PolicyServer"}
1.672913025619137e+09   INFO    Starting Controller {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy"}
1.6729130256191418e+09  INFO    Starting workers    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "worker count": 1}
1.6729130256191454e+09  INFO    Shutdown signal received, waiting for all workers to finish {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy"}
1.672913025619149e+09   INFO    All workers finished    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy"}
1.6729130256191561e+09  INFO    All workers finished    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer"}
1.672913025619181e+09   INFO    Starting EventSource    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "source": "kind source: *v1.AdmissionPolicy"}
1.6729130256192076e+09  INFO    Starting EventSource    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "source": "kind source: *v1.Pod"}
1.6729130256192136e+09  INFO    Starting EventSource    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "source": "kind source: *v1.PolicyServer"}
1.6729130256192462e+09  INFO    Starting Controller {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy"}
1.672913025619254e+09   INFO    Starting workers    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "worker count": 1}
1.6729130256192594e+09  INFO    Shutdown signal received, waiting for all workers to finish {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy"}
1.6729130256192765e+09  INFO    All workers finished    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy"}
1.6729130256192827e+09  INFO    Stopping and waiting for caches
1.6729130256193933e+09  INFO    Stopping and waiting for webhooks
1.6729130256194034e+09  INFO    Wait completed, proceeding to shutdown the manager
1.67291302561941e+09    ERROR   setup   problem running manager {"error": "open /tmp/k8s-webhook-server/serving-certs/tls.crt: no such file or directory"}
main.main
    /workspace/main.go:205
runtime.main
    /usr/local/go/src/runtime/proc.go:250
1.6729130256194332e+09  ERROR   controller-runtime.source   failed to get informer from cache   {"error": "Timeout: failed waiting for *v1.PolicyServer Informer to sync"}
sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1.1
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/source/source.go:144
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext
    /go/pkg/mod/k8s.io/apimachinery@v0.25.3/pkg/util/wait/wait.go:235
k8s.io/apimachinery/pkg/util/wait.poll
    /go/pkg/mod/k8s.io/apimachinery@v0.25.3/pkg/util/wait/wait.go:582
k8s.io/apimachinery/pkg/util/wait.PollImmediateUntilWithContext
    /go/pkg/mod/k8s.io/apimachinery@v0.25.3/pkg/util/wait/wait.go:547
sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/source/source.go:132
viccuad commented 1 year ago

Thanks! Will have a detailed look.

As an easy shot: are you waiting long enough (and checking that the deployments are ready) between deploying the Helm charts? Does deleting the policy-server-default pod help? I just noticed that you are running helm upgrade -i without --wait. Without manually waiting and checking that the resources are ready, kubewarden-controller will not have had time to scaffold and create the needed self-signed cert before the default PolicyServer from kubewarden-defaults starts to come up.
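
A rough sketch of the manual sequencing (the deployment name is derived from the pod name in your output, and the secret name is the one from your kubectl get certificate output above; the extra kubectl checks are just belt-and-braces):

$ helm upgrade -i --wait --namespace kubewarden --create-namespace \
    kubewarden-crds kubewarden/kubewarden-crds
$ helm upgrade -i --wait --namespace kubewarden \
    kubewarden-controller kubewarden/kubewarden-controller
# make sure the controller and its cert-manager-issued serving cert are really there
$ kubectl -n kubewarden rollout status deployment/kubewarden-controller
$ kubectl -n kubewarden get secret webhook-server-cert
$ helm upgrade -i --wait --namespace kubewarden \
    kubewarden-defaults kubewarden/kubewarden-defaults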

Does the kubewarden-controller deployment/pod, or cert-manager, show any errors? (One or the other may have failed to create the cert, hence why the policy-server didn't find it to mount.)

eumel8 commented 1 year ago

Yes, this problem occurred during a parallel Fleet installation of both charts: the controller is not ready while the Policy Server is installing. With a manual installation there is enough time to wait until the controller is ready.

controller log:

1.6729245254905996e+09  INFO    controller-runtime.metrics  Metrics server is starting to listen    {"addr": ":8088"}
1.6729245254908457e+09  INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "policies.kubewarden.io/v1, Kind=PolicyServer", "path": "/mutate-policies-kubewarden-io-v1-policyserver"}
1.6729245254909337e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-policies-kubewarden-io-v1-policyserver"}
1.6729245254909887e+09  INFO    controller-runtime.builder  skip registering a validating webhook, object does not implement admission.Validator or WithValidator wasn't called {"GVK": "policies.kubewarden.io/v1, Kind=PolicyServer"}
1.6729245254910436e+09  INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "policies.kubewarden.io/v1, Kind=ClusterAdmissionPolicy", "path": "/mutate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.672924525491104e+09   INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.6729245254911432e+09  INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "policies.kubewarden.io/v1, Kind=ClusterAdmissionPolicy", "path": "/validate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.6729245254911854e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-policies-kubewarden-io-v1-clusteradmissionpolicy"}
1.6729245254913168e+09  INFO    controller-runtime.builder  Registering a mutating webhook  {"GVK": "policies.kubewarden.io/v1, Kind=AdmissionPolicy", "path": "/mutate-policies-kubewarden-io-v1-admissionpolicy"}
1.6729245254913661e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/mutate-policies-kubewarden-io-v1-admissionpolicy"}
1.672924525491405e+09   INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "policies.kubewarden.io/v1, Kind=AdmissionPolicy", "path": "/validate-policies-kubewarden-io-v1-admissionpolicy"}
1.6729245254914496e+09  INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-policies-kubewarden-io-v1-admissionpolicy"}
1.6729245254917974e+09  INFO    setup   starting manager
1.6729245254920595e+09  INFO    controller-runtime.webhook.webhooks Starting webhook server
1.6729245254922256e+09  INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8088"}
1.672924525492257e+09   INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
1.6729245254922812e+09  INFO    controller-runtime.certwatcher  Updated current TLS certificate
1.6729245254923785e+09  INFO    controller-runtime.webhook  Serving webhook server  {"host": "", "port": 9443}
1.6729245254924843e+09  INFO    controller-runtime.certwatcher  Starting certificate watcher
I0105 13:15:25.593016       1 leaderelection.go:248] attempting to acquire leader lease kubewarden/a4ddbf36.kubewarden.io...
I0105 13:15:44.497043       1 leaderelection.go:258] successfully acquired lease kubewarden/a4ddbf36.kubewarden.io
1.6729245444970973e+09  DEBUG   events  kubewarden-controller-78554b75cd-zh99j_f769c125-e4e0-46f2-aa7e-14bb63ad3c50 became leader   {"type": "Normal", "object": {"kind":"Lease","namespace":"kubewarden","name":"a4ddbf36.kubewarden.io","uid":"222866d3-b650-4249-aa28-d5e730a106ee","apiVersion":"coordination.k8s.io/v1","resourceVersion":"25133221"}, "reason": "LeaderElection"}
1.672924544497277e+09   INFO    Starting EventSource    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "source": "kind source: *v1.PolicyServer"}
1.672924544497334e+09   INFO    Starting EventSource    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "source": "kind source: *v1.AdmissionPolicy"}
1.6729245444973626e+09  INFO    Starting EventSource    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "source": "kind source: *v1.AdmissionPolicy"}
1.6729245444973724e+09  INFO    Starting EventSource    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "source": "kind source: *v1.Pod"}
1.6729245444973826e+09  INFO    Starting EventSource    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "source": "kind source: *v1.PolicyServer"}
1.6729245444973884e+09  INFO    Starting Controller {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy"}
1.672924544497288e+09   INFO    Starting EventSource    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "source": "kind source: *v1.ClusterAdmissionPolicy"}
1.672924544497428e+09   INFO    Starting EventSource    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "source": "kind source: *v1.Pod"}
1.672924544497434e+09   INFO    Starting EventSource    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "source": "kind source: *v1.PolicyServer"}
1.6729245444974391e+09  INFO    Starting Controller {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy"}
1.672924544497375e+09   INFO    Starting EventSource    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "source": "kind source: *v1.ClusterAdmissionPolicy"}
1.6729245444974608e+09  INFO    Starting Controller {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer"}
1.672924544598411e+09   INFO    Starting workers    {"controller": "policyserver", "controllerGroup": "policies.kubewarden.io", "controllerKind": "PolicyServer", "worker count": 1}
1.672924544598412e+09   INFO    Starting workers    {"controller": "clusteradmissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "ClusterAdmissionPolicy", "worker count": 1}
1.6729245445984268e+09  INFO    Starting workers    {"controller": "admissionpolicy", "controllerGroup": "policies.kubewarden.io", "controllerKind": "AdmissionPolicy", "worker count": 1}

cert-manager:

I0105 13:15:17.640538       1 controller.go:178] cert-manager/certificate/mutatingwebhookconfiguration/generic-inject-reconciler "msg"="updated object" "resource_kind"="MutatingWebhookConfiguration" "resource_name"="kubewarden-controller-mutating-webhook-configuration" "resource_namespace"="" "resource_version"="v1"
I0105 13:15:17.646121       1 controller.go:178] cert-manager/certificate/mutatingwebhookconfiguration/generic-inject-reconciler "msg"="updated object" "resource_kind"="MutatingWebhookConfiguration" "resource_name"="kubewarden-controller-mutating-webhook-configuration" "resource_namespace"="" "resource_version"="v1"
I0105 13:15:17.677582       1 controller.go:178] cert-manager/certificate/validatingwebhookconfiguration/generic-inject-reconciler "msg"="updated object" "resource_kind"="ValidatingWebhookConfiguration" "resource_name"="kubewarden-controller-validating-webhook-configuration" "resource_namespace"="" "resource_version"="v1"
I0105 13:15:17.682753       1 controller.go:178] cert-manager/certificate/validatingwebhookconfiguration/generic-inject-reconciler "msg"="updated object" "resource_kind"="ValidatingWebhookConfiguration" "resource_name"="kubewarden-controller-validating-webhook-configuration" "resource_namespace"="" "resource_version"="v1"
I0105 13:15:17.543079       1 conditions.go:203] Setting lastTransitionTime for Certificate "kubewarden-controller-serving-cert" condition "Ready" to 2023-01-05 13:15:17.543066399 +0000 UTC m=+49.900099790
I0105 13:15:17.589430       1 conditions.go:96] Setting lastTransitionTime for Issuer "kubewarden-controller-selfsigned-issuer" condition "Ready" to 2023-01-05 13:15:17.589420762 +0000 UTC m=+49.946454160

Controller after deploying the Policy Server:

1.6729250028985379e+09  DEBUG   controller-runtime.webhook.webhooks received request    {"webhook": "/mutate-policies-kubewarden-io-v1-policyserver", "UID": "d263197b-87df-4f76-aad7-b0f7329e0a45", "kind": "policies.kubewarden.io/v1, Kind=PolicyServer", "resource": {"group":"policies.kubewarden.io","version":"v1","resource":"policyservers"}}
1.672925002898703e+09   INFO    policyserver-resource   default {"name": "default"}
1.6729250028991623e+09  DEBUG   controller-runtime.webhook.webhooks wrote response  {"webhook": "/mutate-policies-kubewarden-io-v1-policyserver", "code": 200, "reason": "", "UID": "d263197b-87df-4f76-aad7-b0f7329e0a45", "allowed": true}

Error logs of the pod:

$ kubectl -n kubewarden logs policy-server-default-6b4949fc56-cfrng|grep ERROR
1.6729253705876324e+09  ERROR   controller-runtime.source   failed to get informer from cache   {"error": "Timeout: failed waiting for *v1.PolicyServer Informer to sync"}
1.6729253705876398e+09  ERROR   controller-runtime.source   failed to get informer from cache   {"error": "Timeout: failed waiting for *v1.PolicyServer Informer to sync"}
1.672925370587686e+09   ERROR   setup   problem running manager {"error": "open /tmp/k8s-webhook-server/serving-certs/tls.crt: no such file or directory"}

secrets/configmaps:

kubectl -n kubewarden get cm                                                
NAME                                   DATA   AGE
kube-root-ca.crt                       1      6d
kubewarden-controller-manager-config   1      15m
policy-server-default                  2      7m24s
kubectl -n kubewarden get secrets 
NAME                                          TYPE                                  DATA   AGE
default-token-lqkw8                           kubernetes.io/service-account-token   3      6d1h
kubewarden-controller-token-v9wqp             kubernetes.io/service-account-token   3      15m
policy-server-default                         Opaque                                2      7m42s
policy-server-root-ca                         Opaque                                3      6d
policy-server-token-xzlbx                     kubernetes.io/service-account-token   3      7m44s
sh.helm.release.v1.kubewarden-controller.v1   helm.sh/release.v1                    1      15m
sh.helm.release.v1.kubewarden-crds.v1         helm.sh/release.v1                    1      3h31m
sh.helm.release.v1.kubewarden-defaults.v1     helm.sh/release.v1                    1      7m44s
webhook-server-cert                           kubernetes.io/tls                     3      6d
viccuad commented 1 year ago

I see.

The Kubewarden charts themselves don't codify any Fleet behavior, AFAIK there's no way to do that.

One can use dependsOn in fleet.yaml:

# will only be deployed after all dependencies are deployed and in a Ready state
dependsOn:

https://fleet.rancher.io/gitrepo-structure#fleetyaml

I have tested it successfully on Rancher 2.7 & Fleet, with this example repo: https://github.com/viccuad/kubewarden-fleet. Note the README.md for the GitRepo name, and what dependsOn expects.
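
For illustration, the fleet.yaml for the kubewarden-defaults path in a repo like that could look roughly like this (the bundle name is an assumption: Fleet derives it from the GitRepo name plus the path, which matches the error message quoted just below):

# helm/kubewarden-defaults/fleet.yaml, illustrative sketch
defaultNamespace: kubewarden
helm:
  releaseName: kubewarden-defaults
dependsOn:
  # assumed bundle name: <GitRepo name>-<path, with slashes replaced by dashes>
  - name: kubewarden-example-helm-kubewarden-controller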

On Fleet, you may see transient errors such as: ErrApplied(1) [Cluster fleet-local/local: dependent bundle(s) are not ready: [kubewarden-example-helm-kubewarden-controller]], but once each chart is correctly deployed, everything will be sane and ready.

If you are not using Fleet but installing Kubewarden via the Rancher catalog, you can add https://charts.kubewarden.io as a Helm repo to the catalog and then install the "Kubewarden" Application. This workflow currently supports hard dependencies between the charts, supported Rancher versions, etc., via the Helm catalog.cattle.io/* annotations in the charts (see here), so everything will work fine and the Rancher catalog will install the Kubewarden charts in the correct order, once their dependencies are ready.
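
Just to give an idea of the mechanism (these are generic Rancher catalog annotation keys; the concrete values in the real Kubewarden charts may differ, so treat this purely as an illustration):

# Chart.yaml excerpt, illustrative only
annotations:
  catalog.cattle.io/auto-install: kubewarden-crds=match   # assumed: install the CRDs chart first
  catalog.cattle.io/rancher-version: ">= 2.6.0-0"         # assumed supported Rancher version range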

viccuad commented 1 year ago

Repurposing this issue to track some docs with respect to Fleet configuration :).

eumel8 commented 1 year ago

@viccuad I tried this on 3 clusters but Kubewarden is not installed. The clusters are active in Fleet, the GitRepo is successfully deployed, and the created bundle is active and green. Only the cluster is reported as not ready for the GitRepo.

I also tried the catalog installation. Controller and CRDs are installed, but the policy server is not, even though it's enabled in the values file.

viccuad commented 1 year ago

@viccuad I tried this on 3 clusters but Kubewarden is not installed. The clusters are active in Fleet, the GitRepo is successfully deployed, and the created bundle is active and green. Only the cluster is reported as not ready for the GitRepo.

Here everything works fine. Are you sure you have added the GitRepo to the correct Fleet workspace, so that it applies to your specific cluster, and that you set the repo path correctly? (For example, I only have one cluster, a local one, hence I added the GitRepo under the fleet-local workspace.)
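
For reference, a minimal GitRepo for a single local cluster could look like this (the repo URL is the example repo from above; branch and paths are assumptions and have to match the actual repo layout):

apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: kubewarden-example
  namespace: fleet-local            # the workspace that targets the built-in local cluster
spec:
  repo: https://github.com/viccuad/kubewarden-fleet
  branch: main                      # assumed branch
  paths:                            # assumed layout: one path per chart
    - helm/kubewarden-crds
    - helm/kubewarden-controller
    - helm/kubewarden-defaults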

I just realized that deleting the GitRepo means that the 3 Kubewarden charts get removed at once. This makes the pre-delete Helm hook job that deletes the default policy-server fail, as the CRDs have already been removed. This pre-delete hook is needed because we need to vacate the webhooks of the policies (this is true of any policy engine) before deleting the PolicyServer; otherwise the cluster would have webhooks for policies but no policies being run, and would reject everything and be in a failed state.

Note that uninstalling CRDs automatically is normally not supported by any tooling, because of situations like this one, and Rancher Fleet is no exception when blindly removing the GitRepo. This could be helped if Rancher Fleet provided a way to define dependencies that is honoured also on chart uninstall (but this should be taken up with Rancher Fleet; we can't do much here, Kubewarden is a CNCF project).

If you want to perform a correct removal, make sure to first remove the Bundle for kubewarden-defaults from the cluster, then kubewarden-controller, and last kubewarden-crds.
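
Sketched with kubectl against the Fleet bundles (bundle names follow the assumed <GitRepo>-<path> pattern from above, and the workspace namespace is fleet-local for a local cluster; note that Fleet may recreate a bundle as long as the GitRepo still lists its path, so alternatively drop the paths from the GitRepo one at a time in the same order):

$ kubectl -n fleet-local delete bundles.fleet.cattle.io kubewarden-example-helm-kubewarden-defaults
$ kubectl -n fleet-local delete bundles.fleet.cattle.io kubewarden-example-helm-kubewarden-controller
$ kubectl -n fleet-local delete bundles.fleet.cattle.io kubewarden-example-helm-kubewarden-crds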

I also tried the catalog installation. Controller and CRDs are installed, but the policy server is not, even though it's enabled in the values file.

This is true. The upcoming Kubewarden UI marks the kubewarden-defaults chart as hidden. That UI will be installable as a Rancher Extension and solves this, in addition to providing a better first-install experience and more features.

But note that if Fleet left the cluster in a bad state because it failed while uninstalling charts (see the previous paragraph), then one needs to do the cleanup manually, so that the catalog installs kubewarden-controller successfully and proceeds to kubewarden-defaults (which instantiates the PolicyServer).

viccuad commented 1 year ago

@eumel8 I have added the current page to docs: https://docs.kubewarden.io/operator-manual/Rancher-Fleet

I hope the suggested workarounds there help! I wonder what else can be done from the Kubewarden side; to be honest, I think we can't do much else.

eumel8 commented 1 year ago

@viccuad thx, will try it

eumel8 commented 1 year ago

Hello @viccuad,

some progress today. The first error was that I had no cluster in the cluster group. In K3s / Rancher 2.7 there is a Cluster Group default with a matchLabel:

spec:
  selector:
    matchLabels:
      name: local

But the local Cluster had no label name: local. I added it and then the installation was successful in Fleet.
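
For anyone else hitting this, something along these lines should add the missing label (assuming the built-in local cluster object, which is named local and lives in the fleet-local workspace):

$ kubectl -n fleet-local label clusters.fleet.cattle.io local name=local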

I repeated this on a downstream cluster with a restricted NetworkPolicy, which also works, apart from the cleanup job.

As for my previous installation problem, it must have been caused by the Helm chart values. Can you show in your example how to provide values? I think one example is enough. Thx!

viccuad commented 1 year ago

Happy to hear that it is working!

@eumel8 I pushed a new commit to the example that is now in the Kubewarden docs, https://github.com/kubewarden/fleet-example. This commit https://github.com/kubewarden/fleet-example/commit/cd771b8533821bb15dc228748c766d5adc15d308 sets values and enables the recommended policies in monitor mode. There are several ways to do it according to the Rancher Fleet docs; I selected one.

Tested it here and it works.
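
Roughly, the idea is to set the chart values in the fleet.yaml of the kubewarden-defaults path, something like this (a sketch, not the literal commit; the value keys are the same ones used with --set earlier in this thread):

# helm/kubewarden-defaults/fleet.yaml, sketch
defaultNamespace: kubewarden
helm:
  releaseName: kubewarden-defaults
  values:
    recommendedPolicies:
      enabled: true
      defaultPolicyMode: monitor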

Closing this issue, and thanks again!

eumel8 commented 1 year ago

Just to sum up and come back to it later: the crash of the Policy Server was caused by a wrongly mirrored image. I had used the controller image instead of the policy server image. After fixing it, the server starts.

The Fleet example worked so far, also with our fork. The problem: if I delete the GitRepo to uninstall, the CRD chart is deleted first and the pre-delete hook can't find the resource policyservers.policies.kubewarden.io, so in the end the controller chart cannot be uninstalled. This job also lacks the possibility to set a SecurityContext and can't run in our hardened environment. Also, the deployment has a hardcoded value, which is not enough to run in a secure context. I will open a new issue for that. The restricted PodSecurityPolicy alone works, apart from the cleanup job. The cleanup job should be removed; if I delete the CRD, the resources are already gone.

jvanz commented 1 year ago

Thanks for your feedback @eumel8! I could reproduce the issue that you reported with the Kubewarden removal when using Fleet. I've opened an issue to address that.