Closed mueller-ma closed 1 month ago
Hi @mueller-ma ! Thanks for the report!
I'm trying to reproduce your issue using RKE2 v1.28.10+rke2r1:
/var/lib/rancher/rke2/bin/kubectl --kubeconfig ./rke2-kubeconfig.yaml version
Client Version: v1.28.10+rke2r1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.10+rke2r1
Unfortunately, I was not able to simulate the issue:
helm --kubeconfig ./rke2-kubeconfig.yaml upgrade --install --wait -n kubewarden kubewarden-defaults kubewarden/kubewarden-defaults
Release "kubewarden-defaults" has been upgraded. Happy Helming!
NAME: kubewarden-defaults
LAST DEPLOYED: Mon Jun 3 16:51:10 2024
NAMESPACE: kubewarden
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
You now have a `PolicyServer` named `default` running in your cluster.
It is ready to run any `clusteradmissionpolicies.policies.kubewarden.io` or
`admissionpolicies.policies.kubewarden.io` resources.
For more information check out https://docs.kubewarden.io/quick-start.
Discover ready to use policies at https://artifacthub.io/packages/search?kind=13.
/var/lib/rancher/rke2/bin/kubectl --kubeconfig ./rke2-kubeconfig.yaml get pods -n kubewarden
NAME READY STATUS RESTARTS AGE
kubewarden-controller-577857d487-8lqh7 1/1 Running 0 9m37s
policy-server-default-559f5b45fd-qxn44 1/1 Running 0 8m35s
I have some questions to better understand your situation. In the commands you shared with us, I can see that you used `upgrade --install`, and the successful Helm output shows the installation at revision 2, which tells me this was your second install attempt. Is that correct? I've tried reinstalling `kubewarden-defaults` as well, but so far everything works fine.
Furthermore, I noticed that your issuers and certificates are 17h old. Assuming you ran those commands right after the failed installation, could those certificates and issuers be leftovers from a previous installation that are making the current one fail? Can you also confirm your cert-manager version?
I did the initial installation and got the certificate error. Then I ran the update commands 17h later so I could copy the error into this issue. I had already tried to reinstall Kubewarden from scratch, but that didn't help.
`cert-manager` is at version v1.14.5.
In the meantime the Kubernetes cluster got updated to v1.28.10+rke2r1, but the error is the same. I tried a fresh installation:
$ helm install --wait -n kubewarden kubewarden-defaults kubewarden/kubewarden-defaults
Error: INSTALLATION FAILED: 1 error occurred:
* Internal error occurred: failed calling webhook "mpolicyserver.kb.io": failed to call webhook: Post "https://kubewarden-controller-webhook-service.kubewarden.svc:443/mutate-policies-kubewarden-io-v1-policyserver?timeout=10s": tls: failed to verify certificate: x509: certificate signed by unknown authority
Thanks for the feedback @mueller-ma!
Let me share what I have in mind and ask you for some more info. Considering the error message you shared with us: it looks like, during the installation of the `kubewarden-defaults` Helm chart, the API server sends a request to the controller webhook (`mpolicyserver.kb.io`) to validate/mutate the PolicyServer resource being applied to the cluster, but that call fails for some reason. Therefore, I would like to check two main things: 1. whether cert-manager properly configured the certificates in the webhook configurations; 2. whether some configuration in the RKE2 cluster is interfering.
To start reasoning about that, let's gather some data. Please share the output of the following commands with us:
kubectl get secrets -n kubewarden webhook-server-cert -o yaml
kubectl get validatingwebhookconfigurations kubewarden-controller-validating-webhook-configuration -o yaml
kubectl get mutatingwebhookconfigurations kubewarden-controller-mutating-webhook-configuration -o yaml
kubectl get pods -n kube-system -o yaml <api server pod name>
Furthermore, can you try to reinstall and collect the logs from the API server and the Kubewarden controller? I would like more context on what's going on during the installation.
Another question: do you have any customization in your RKE2 installation? Did you set any options in the config file or on the command line during installation? I'm asking because I would like an environment as close to yours as possible.
I can share some insights about this cluster:
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-ingress-nginx
  namespace: kube-system
spec:
  valuesContent: |-
    controller:
      config:
        force-ssl-redirect: 'true'
        hsts-max-age: '31536000'
        ssl-protocols: TLSv1.2 TLSv1.3
      extraArgs:
        default-ssl-certificate: cert-manager/default-cert # LE wildcard certificate in the cert-manager namespace. It's applied to all ingresses without a `tls` block.
# cat /etc/rancher/rke2/config.yaml
server: https://<redacted>:9345
token: <redacted>
data-dir: /var/lib/rancher/rke2
cni: canal
tls-san:
  - cluster.local
  - <redacted>
snapshotter: overlayfs
node-name: <redacted>
Here's the output of the commands you requested. I replaced the name of the node with `<node-name>`.
Logs from the `kube-apiserver-...` pod when I try to install `kubewarden-defaults` again:
W0605 07:53:49.258539 1 dispatcher.go:225] Failed calling webhook, failing closed mpolicyserver.kb.io: failed calling webhook "mpolicyserver.kb.io": failed to call webhook: Post "https://kubewarden-controller-webhook-service.kubewarden.svc:443/mutate-policies-kubewarden-io-v1-policyserver?timeout=10s": tls: failed to verify certificate: x509: certificate signed by unknown authority
and from pods/kubewarden-controller-....:
2024/06/05 07:57:54 http: TLS handshake error from 10.42.1.0:43924: remote error: tls: bad certificate
In both cases it's only one line without much information :/
Thanks @mueller-ma !
Hmm, interesting... it seems that cert-manager is not injecting the caBundle into your webhook configurations. Your webhooks carry the annotation `cert-manager.io/inject-ca-from: kubewarden/kubewarden-controller-serving-cert`, but the `caBundle` is missing. Take a look at an example from my cluster:
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  annotations:
    cert-manager.io/inject-ca-from: kubewarden/kubewarden-controller-serving-cert
    meta.helm.sh/release-name: kubewarden-controller
    meta.helm.sh/release-namespace: kubewarden
  creationTimestamp: "2024-06-04T21:29:10Z"
  generation: 2
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: kubewarden-controller
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kubewarden-controller
    app.kubernetes.io/part-of: kubewarden
    app.kubernetes.io/version: v1.12.0
    helm.sh/chart: kubewarden-controller-2.0.11
  name: kubewarden-controller-mutating-webhook-configuration
  resourceVersion: "134591"
  uid: d621ef1a-c0c3-4a7c-a492-a909a3f7ae21
webhooks:
[...]
- admissionReviewVersions:
  - v1
  - v1beta1
  clientConfig:
    caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUROakNDQWg2Z0F3SUJBZ0lSQUlzcWNHbGI5cW9xT3A0V3cyaTZWWW93RFFZSktvWklodmNOQVFFTEJRQXcKQURBZUZ3MHlOREEyTURNeE9UUXlNek5hRncweU5EQTVNREV4T1RReU16TmFNQUF3Z2dFaU1BMEdDU3FHU0liMwpEUUVCQVFVQUE0SUJEd0F3Z2dFS0FvSUJBUUNydmdZSTc4bnMzWE1kclc1bU5mSldReEpMKzRlcnlMcUNISm5mCjBUNHVpVk9nNVdQZnpHWEZWWGxBcUZDUytPM0ZZR3NYSVAwN3JRM0pWMlJLaGE1dTVHT0hzN1lEY3ZoajVLVGwKbmE3WGo0YjFLakdWRlUzZzY1VGJZV1dWek50QkxTR29vSDV0UTVYVDdoNFUwZzR3VmV3RldCRFNUMlhNa0NLaApDYVV0UzRYVWtoTWhCbUNyQ0I0K3lXcngyckN3bGc5SE40UjdZekdOK3ZFTnVsaXUwcEdJSHdKUUdDUVM0R1lOCjl6VVN5VDBHSVVXb2cxWU1GTk13NTZGVHp1S3JlMUg4dkVma1ZzcTJTU2tNWEs0WDlHOU1ub0JQVHlqelRHeloKV2UxdDFRQ1lvR2Uvc0FKKzNaWmZxR2x5eDIzRjNxY1FibnJ6bHRHQWQ2RkxjWUhYQWdNQkFBR2pnYW93Z2FjdwpEZ1lEVlIwUEFRSC9CQVFEQWdXZ01Bd0dBMVVkRXdFQi93UUNNQUF3Z1lZR0ExVWRFUUVCL3dSOE1IcUNOR3QxClltVjNZWEprWlc0dFkyOXVkSEp2Ykd4bGNpMTNaV0pvYjI5ckxYTmxjblpwWTJVdWEzVmlaWGRoY21SbGJpNXoKZG1PQ1FtdDFZbVYzWVhKa1pXNHRZMjl1ZEhKdmJHeGxjaTEzWldKb2IyOXJMWE5sY25acFkyVXVhM1ZpWlhkaApjbVJsYmk1emRtTXVZMngxYzNSbGNpNXNiMk5oYkRBTkJna3Foa2lHOXcwQkFRc0ZBQU9DQVFFQVVVdjMzYTNwCjM0ZkxUZzB2L05lS1l2RUxhL2hjM0xlVkdZZG1qTjRmcnlnb2p0SVRrYlJnOS8rSlFrbUl0Tmh3UXBDOEpKbmUKSkpkRTU0RmZqampuWFU5bThwWHRDNUhsY1kxNU1LYVhScG51bHNVdm5VSmRBdG9ManFTUjVZUjZ1ZDZDaGhsbQowYlJzUG9nVXlGRTdZV3hRUEh3WjV4RENsTGZ0cEoxUTY2VEIyaUwrcHg2akR3Yi9yeHBYeWI4aHFIRmNpM1huCm5ZZlJkVnR4S1pqMEVCaXpyc3E4Q1g1a2IzK0toYVFmMnlDd3NIdVdqSXZPUDVVek9PSFVlcURneERHRzhNYkEKa0E2VTEzLzFyanE0ck9aTm8yTUkyUWM0bjNNdnNoY052TTVoWHF0aG1YMmdMcHROT3FNZDVKVldXcU4ycEpHdgp4K1FURHJRREx1NzdzZz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    service:
      name: kubewarden-controller-webhook-service
      namespace: kubewarden
      path: /mutate-policies-kubewarden-io-v1-policyserver
      port: 443
Note that the certificate should be in the `kubewarden` namespace, but in the issue description I cannot see the certificate's namespace. Can you double-check that?
Another question: is the cainjector enabled in your cert-manager installation? It can be disabled in the Helm chart installation. Maybe you can share the values used for your cert-manager installation so I can replicate it here. I'm assuming you install cert-manager using Helm commands, am I right? Or are you using the Helm CRDs available in RKE2?
If you have the cainjector running in the cluster, can you see any errors in its logs? Maybe share those logs here as well.
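For anyone hitting this later: the `caBundle` field is just a base64-encoded PEM bundle, so a missing or malformed value is easy to spot by decoding it. Below is a minimal offline sketch (not a command from this thread); the sample value is only the PEM header line, not a real certificate. Against a live cluster, the value would instead come from something like `kubectl get mutatingwebhookconfigurations kubewarden-controller-mutating-webhook-configuration -o jsonpath='{.webhooks[0].clientConfig.caBundle}'`.

```shell
# Sample caBundle: base64 of just "-----BEGIN CERTIFICATE-----" (illustrative).
CA_BUNDLE='LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0t'

if [ -z "$CA_BUNDLE" ]; then
  # An empty value means cainjector never wrote the bundle,
  # which is exactly the symptom in this issue.
  echo 'caBundle is missing'
else
  # A healthy bundle decodes to one or more PEM certificates.
  printf '%s' "$CA_BUNDLE" | base64 -d
  echo
fi
```

A healthy webhook prints `-----BEGIN CERTIFICATE-----` as the first line of the decoded output; anything else (or an empty `caBundle`) points at the cainjector.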
I found the issue: I had applied the cert-manager best practices, which include https://cert-manager.io/docs/installation/best-practice/#memory:
cainjector:
  extraArgs:
    - --namespace=cert-manager
    - --enable-certificates-data-source=false
Removing the `extraArgs` fixed the missing `caBundle`. The installation works now, thank you for your help :)
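To spell out why those flags broke the injection: `--namespace=cert-manager` restricts the cainjector to watching resources in the cert-manager namespace only, and `--enable-certificates-data-source=false` stops it from using Certificate resources as a source of CA data, so the `cert-manager.io/inject-ca-from: kubewarden/kubewarden-controller-serving-cert` annotation can never be honored. A sketch of the corresponding cert-manager Helm values after the fix (assumption: `cainjector.enabled: true` is the chart default and is shown here only for clarity):

```yaml
# cert-manager Helm values (sketch): leave the cainjector unrestricted so it
# can honor inject-ca-from annotations pointing at Certificates in any
# namespace, e.g. kubewarden/kubewarden-controller-serving-cert.
cainjector:
  enabled: true
  extraArgs: []
```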
Is there an existing issue for this?
Current Behavior
I tried the commands from https://docs.kubewarden.io/quick-start to install Kubewarden, but the deployment of `kubewarden-defaults` fails. I already had cert-manager installed, and as you can see at the end of the output, both the issuer and the certificate were created:
Expected Behavior
Installation works
Steps To Reproduce
Environment
Anything else?
No response