Open GhangZh opened 8 months ago
Actually, it's not a bug. You should check whether the webhook is running and check the connection between kube-apiserver and the webhook pod.
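A minimal sketch of the connectivity half of that check, in Python with only the standard library. It only tests whether a TCP connection can be opened. The host and port are placeholders you have to fill in with the address the webhook actually listens on, and the result only says something about the kube-apiserver path if you run it from the node where kube-apiserver runs.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For the host you could use the pod IP reported by `kubectl get pods -n volcano-system -o wide`. A `False` result points at networking (CNI, kube-proxy, firewall) rather than at the webhook itself.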
I checked: volcano-admission is running, and only a few errors were reported.
It seems that the server didn't start successfully. You should check that the TLS certificate is signed correctly.
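One way to test whether the certificate is signed correctly is to attempt a TLS handshake that trusts only the CA you expect. The sketch below assumes you have that CA saved as a PEM file. `ca_file`, `server_name` and the port are hypothetical parameters, not values taken from volcano, and `server_name` must match a name in the serving certificate.

```python
import socket
import ssl


def tls_handshake_ok(host, port, ca_file, server_name, timeout=3.0):
    """Return (ok, detail) for a TLS handshake verified against ca_file only."""
    try:
        # PROTOCOL_TLS_CLIENT requires a valid certificate and checks the hostname.
        context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
        context.load_verify_locations(cafile=ca_file)
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=server_name) as tls:
                return True, tls.version()
    except ssl.SSLCertVerificationError as exc:
        return False, "certificate verification failed: " + exc.verify_message
    except OSError as exc:
        # Missing CA file, refused connection, timeout, or another TLS error.
        return False, "connection or TLS error: " + str(exc)
```

A verification failure here suggests the serving certificate was not issued by that CA or does not cover the name being requested, which is a different problem from the server not running.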
I got the same errors in the past. No, the server was running successfully with a self-signed certificate. To fix this error, I think we should do the following:
- Add the CA certificate generated by volcano to the trusted CA files on the nodes where kube-apiserver runs.
- Restart all kube-apiserver pods in the cluster.
PS:
Can volcano add its CA to the trusted CAs on the nodes automatically when it is deployed? If so, how?
kube-apiserver uses the mutatingwebhookconfiguration/validatingwebhookconfiguration to access the admission server, and that configuration already includes the CA bundle generated by volcano :)
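To confirm that the configuration really carries the CA you expect, you can decode the `caBundle` of each webhook and compare it with the CA certificate. This is a rough sketch, assuming the input is the JSON of a single configuration object (for example from `kubectl get validatingwebhookconfiguration <name> -o json`) and that the bundle holds exactly one certificate. The function names are made up for illustration.

```python
import base64
import json


def decoded_ca_bundles(webhook_config_json):
    """Map each webhook name to its decoded caBundle (PEM text, empty if unset)."""
    config = json.loads(webhook_config_json)
    bundles = {}
    for hook in config.get("webhooks", []):
        encoded = hook.get("clientConfig", {}).get("caBundle", "")
        pem = base64.b64decode(encoded).decode("ascii", errors="replace")
        bundles[hook["name"]] = pem
    return bundles


def hooks_with_other_ca(webhook_config_json, expected_ca_pem):
    """Return the names of webhooks whose caBundle differs from expected_ca_pem."""
    expected = expected_ca_pem.strip()
    bundles = decoded_ca_bundles(webhook_config_json)
    return sorted(name for name, pem in bundles.items() if pem.strip() != expected)
```

An empty result means every webhook in that configuration already trusts the expected CA, in which case adding the CA to the node trust store should not be necessary.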
@GhangZh I have run into the same issue. Have you solved it yet?
I suddenly thought of a possibility. Could it be a network problem between the volcano-admission pod and other volcano pods?
root@VM-16-7-ubuntu:~# kubectl get pods -nvolcano-system -owide
NAME                                   READY   STATUS      RESTARTS   AGE   IP         NODE               NOMINATED NODE   READINESS GATES
volcano-admission-7f4fcd89b4-758h5     1/1     Running     0          56s   10.6.1.3   cluster1-worker    <none>           <none>
volcano-admission-init-7tvzn           0/1     Completed   0          56s   10.6.2.3   cluster1-worker2   <none>           <none>
volcano-controllers-6fb4668949-jpk7j   1/1     Running     0          56s   10.6.2.2   cluster1-worker2   <none>           <none>
volcano-scheduler-7f6f746f98-2xvk8     1/1     Running     0          56s   10.6.1.2   cluster1-worker    <none>           <none>
I have also noticed a large number of errors in the kube-proxy on the same node as the Kubernetes API server:
I0618 03:39:48.050278 1 proxier.go:854] "Sync failed" retryingTime="30s"
E0618 03:40:18.202297 1 proxier.go:1546] "Failed to execute iptables-restore" err=<
exit status 4: iptables-restore v1.8.7 (nf_tables):
line 2080: CHAIN_USER_DEL failed (Device or resource busy): chain KUBE-SEP-XXX
So is it possible that this is the root cause of this issue?
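To get a feel for how widespread these kube-proxy failures are, the log can be summarized with a short script. This sketch only counts lines that look like the ones quoted above and collects the chain names reported as busy. It assumes the log text has already been saved or piped in, and it does not by itself show that kube-proxy is the root cause.

```python
import re

# Matches the nf_tables message quoted above and captures the chain name.
CHAIN_BUSY = re.compile(
    r"CHAIN_USER_DEL failed \(Device or resource busy\): chain (\S+)"
)


def summarize_kube_proxy_log(log_text):
    """Count iptables-restore failures and list the chains reported as busy."""
    failures = sum(
        "Failed to execute iptables-restore" in line
        for line in log_text.splitlines()
    )
    chains = sorted(set(CHAIN_BUSY.findall(log_text)))
    return {"restore_failures": failures, "busy_chains": chains}
```

If the failure count keeps growing while the webhook errors occur, the Service rules on that node may be stale, which would fit the network-problem theory.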
What happened: The volcano webhook often reports the following error
What you expected to happen: No error
Environment:
- kubectl version: 1.24
- uname -a: