aledbf / kube-keepalived-vip

Kubernetes Virtual IP address/es using keepalived
Apache License 2.0

kube-keepalived-vip stuck in RunContainerError state #104

Open patrickppeng opened 5 years ago

patrickppeng commented 5 years ago

We have repeatedly seen the keepalived pod get stuck in RunContainerError. In addition, the liveness probe keeps failing for an unknown reason, which causes k8s to frequently restart the pods that do not own the VIP.

common-services-kube-keepalived-vip-4frnl   1/1   Running             0     3d23h
common-services-kube-keepalived-vip-5tqc2   0/1   RunContainerError   215   3d23h
common-services-kube-keepalived-vip-dqnxn   1/1   Running             69    3d23h

Normal SandboxChanged 6m22s (x97594 over 2d15h) kubelet, c2df1e48-b3df-46d1-bd22-1e32010eeb0d Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 81s (x96282 over 2d15h) kubelet, c2df1e48-b3df-46d1-bd22-1e32010eeb0d Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "common-services-kube-keepalived-vip-5tqc2": Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:301: running exec setns process for init caused \"signal: killed\"": unknown

[root@sv-ccm1 cust]# kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:35:51Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}

panpan0000 commented 5 years ago

I don't believe this is a kube-keepalived-vip problem. RunContainerError usually points to a Docker daemon issue. Based on what I found online, could you try freeing the node's memory cache and restarting the Docker daemon? See:
https://github.com/opencontainers/runc/issues/1343
https://github.com/opencontainers/runc/issues/1740
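
A minimal sketch of that suggestion, to be run on the affected node. It assumes a Linux node where Docker is managed by systemd under the unit name `docker` and where you have root privileges; none of this is confirmed in the thread. The `run` helper, the `DRY_RUN` variable, and the function name are hypothetical and added only for illustration. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only: assumes a systemd-managed Docker daemon (unit "docker") and root.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

free_cache_and_restart_docker() {
  # Write dirty pages to disk first; drop_caches only frees clean cache entries.
  run sync
  # Writing 3 asks the kernel to drop the page cache plus dentries and inodes.
  run sh -c 'echo 3 > /proc/sys/vm/drop_caches'
  # Restart the Docker daemon; running containers on the node may be interrupted.
  run systemctl restart docker
}

free_cache_and_restart_docker
```

Set `DRY_RUN=0` to actually execute the commands. Dropping caches only releases reclaimable memory, so if the node is genuinely short on memory the error may return.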