Open KosShutenko opened 9 months ago
@KosShutenko - have you found a workaround for this? We upgraded EKS to 1.29 yesterday, and we are seeing the exact same errors.
Actually, here's a little more info - we were already running EKS 1.29, and Jenkins was working. But we updated our nodes from Amazon Linux 2 to Amazon Linux 2023, and we are getting the same error you documented.
I would raise this with the kubernetes-plugin, it doesn't look related to the helm chart
Quick update on this -- it turns out that it was a new cluster component we had added recently -- the vertical pod autoscaler (https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler).
The VPA installs an admission controller webhook. It was that webhook that was timing out, causing the pod creation API call to timeout.
@KosShutenko - your problem may not be the VPA, but you might want to look at all your mutating webhooks:
kubectl get MutatingWebhookConfiguration -A
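To see which of those webhooks are most likely to stall pod creation, something like this can help (just a sketch - the custom-columns fields are standard admissionregistration fields, adjust as needed):

# Show each mutating webhook's timeout and failure policy
kubectl get mutatingwebhookconfigurations -o custom-columns='NAME:.metadata.name,TIMEOUT:.webhooks[*].timeoutSeconds,FAILURE_POLICY:.webhooks[*].failurePolicy'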
I would also suggest you look at your Kubernetes API server logs in CloudWatch for more clues. In my case, I found log entries like this:
Failed calling webhook, failing open vpa.k8s.io: failed calling webhook "vpa.k8s.io": failed to call webhook: Post "https://vpa-webhook.vpa.svc:443/?timeout=30s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
that led me to identify VPA as the culprit.
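For anyone hitting the same thing, a quick sanity check is whether the webhook backend is reachable at all (a sketch only; the vpa namespace and service name come from the error above, and app=vpa-admission-controller is the label the upstream VPA manifests use - adjust for your install):

# Does the webhook service have endpoints, and is the admission controller running?
kubectl -n vpa get svc,endpoints vpa-webhook
kubectl -n vpa get pods -l app=vpa-admission-controller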
Describe the bug
I have an EKS cluster running version 1.29. I've installed the Jenkins Helm chart (latest version) via FluxCD, changing only the Ingress settings in the values. After installing Jenkins I tested the Kubernetes connection and it's OK.
But a test job with the default pipeline proposed by Jenkins cannot be executed. I don't see any agent pods being started in the Jenkins namespace.
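To double-check, I watched the namespace while the test job was queued (a sketch; my release is installed in the jenkins namespace):

# No agent pods ever appear while the build sits in the queue
kubectl -n jenkins get pods --watch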
In the console output I see:
In the jenkins-0 pod (jenkins container) I see the following logs:
Version of Helm and Kubernetes
Chart version
jenkins-5.0.13
What happened?
What you expected to happen?
I have other Jenkins installations on 1.26-1.27 GKE clusters and they work fine. Jenkins creates agent pods and executes pipelines.
How to reproduce it
Anything else we need to know?
No response