nwcdheap / kops-cn

Quickly deploy a K8s cluster with Kops in the AWS China Ningxia/Beijing regions
Apache License 2.0

Update default image in bjs #84

Closed jansony1 closed 5 years ago

jansony1 commented 5 years ago

Hi

My customer and I found that deploying the new version in bjs leaves kube-dns continuously pending or in an error state.

After checking the kube-dns pod itself, I saw the following events:

  Normal   Scheduled               23m                default-scheduler                                    Successfully assigned kube-system/kube-dns-cfbbccd4c-62cfk to ip-10-0-94-194.cn-north-1.compute.internal
  Warning  FailedCreatePodSandBox  23m                kubelet, ip-10-0-94-194.cn-north-1.compute.internal  Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "fc349808fcd3751e8378a01577f4b37c9f1a54192005623b6aeb097d45c01c97" network for pod "kube-dns-cfbbccd4c-62cfk": NetworkPlugin cni failed to set up pod "kube-dns-cfbbccd4c-62cfk_kube-system" network: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp [::1]:50051: connect: connection refused", failed to clean up sandbox container "fc349808fcd3751e8378a01577f4b37c9f1a54192005623b6aeb097d45c01c97" network for pod "kube-dns-cfbbccd4c-62cfk": NetworkPlugin cni failed to teardown pod "kube-dns-cfbbccd4c-62cfk_kube-system" network: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp [::1]:50051: connect: connection refused"]
  Normal   SandboxChanged          3m (x95 over 23m)  kubelet, ip-10-0-94-194.cn-north-1.compute.internal  Pod sandbox changed, it will be killed and re-created.
  Normal   SandboxChanged          6s (x7 over 1m)    kubelet, ip-10-0-94-194.cn-north-1.compute.internal  Pod sandbox changed, it will be killed and re-created.
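
For anyone triaging the same symptom, here is a minimal sketch of pulling the refused endpoint out of events like the ones above. It assumes the text comes from the Events section of `kubectl describe pod`; the helper name `refused_endpoints` and the shortened `sample` string are made up for illustration and are not part of kops or kubectl.

```python
import re

# Matches the endpoint in messages such as
# "dial tcp [::1]:50051: connect: connection refused".
REFUSED = re.compile(r"dial tcp (\S+): connect: connection refused")


def refused_endpoints(events_text):
    """Return the sorted, de-duplicated endpoints that refused connections
    in FailedCreatePodSandBox warning events (hypothetical helper)."""
    endpoints = set()
    for line in events_text.splitlines():
        fields = line.split()
        # Event rows start with the type (Normal/Warning), then the reason.
        if len(fields) >= 2 and fields[0] == "Warning" and fields[1] == "FailedCreatePodSandBox":
            endpoints.update(REFUSED.findall(line))
    return sorted(endpoints)


# Shortened stand-in for the events in this issue (node name is a placeholder).
sample = (
    '  Warning  FailedCreatePodSandBox  23m  kubelet, node-a  Failed create pod sandbox: '
    'desc = "transport: Error while dialing dial tcp [::1]:50051: connect: connection refused"\n'
    '  Normal   SandboxChanged  3m (x95 over 23m)  kubelet, node-a  '
    'Pod sandbox changed, it will be killed and re-created.\n'
)

print(refused_endpoints(sample))
```

In this case the only refused endpoint is the local address `[::1]:50051`, which suggests the CNI plugin could not reach a process on the node itself rather than a remote service.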

I suspect the error is caused by the kubelet in the default image, so I changed the default image according to https://coreos.com/os/docs/latest/booting-on-ec2.html

After that change, the deployment worked fine across multiple attempts.

pahud commented 5 years ago

@jansony1 thanks