kubectl get nodes on the master does not list one or more worker nodes.
Issue can be reproduced intermittently.
Restarting kubelet and/or kube-apiserver does not help.
This is not a transient failure: the affected worker nodes are never able to update their status and never show up in kubectl get nodes.
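A minimal way to confirm the symptom, assuming the node registers under its IP (172.0.60.57, as in the logs below) and the kubelet runs as the kubelet.service systemd unit on CoreOS:

# On the master: watch node objects come and go
kubectl get nodes --watch -o wide

# Check whether the missing node object exists at all
kubectl get node 172.0.60.57 -o yaml

# On the worker: follow the kubelet's registration/status-update loop
journalctl -u kubelet.service -f | grep kubelet_node_status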
Setup details:
3 worker nodes, 1 master node, 1 etcd node
All nodes run CoreOS-stable-1409.6.0-hvm (ami-00110279)
Issue can be reproduced with Kubernetes 1.6.4 as well as 1.7.0.
Issue can be reproduced with etcd 3.5.4 as well as 2.7.* (older version).
The kubelet on the worker nodes fails to update the node status even though it reports that it registered successfully:
kubelet-wrapper[1657]: I0804 16:42:15.216223 1657 kubelet_node_status.go:77] Attempting to register node 172.0.60.57
kubelet-wrapper[1657]: I0804 16:42:15.218882 1657 kubelet_node_status.go:80] Successfully registered node 172.0.60.57
kubelet-wrapper[1657]: E0804 16:42:25.230766 1657 kubelet_node_status.go:326] Error updating node status, will retry: error getting node "172.0.60.57": nodes "172.0.60.57" not found
kubelet-wrapper[1657]: E0804 16:42:25.232449 1657 kubelet_node_status.go:326] Error updating node status, will retry: error getting node "172.0.60.57": nodes "172.0.60.57" not found
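To correlate the kubelet's "not found" errors with the node object actually being removed, watching from the master side can help; this is only a sketch, and the event reasons in the grep pattern are assumptions about what the node controller emits:

# Watch node add/delete notifications as the kubelet re-registers
kubectl get nodes --watch-only

# Look for node lifecycle events around the same timestamps
kubectl get events --all-namespaces --sort-by=.lastTimestamp | grep -E 'RegisteredNode|RemovingNode|DeletingNode'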
The Kubernetes API server logs reveal a conflict while updating the node in etcd, due to which the API server deletes the node.
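To narrow down whether the delete happens at the storage layer, the node key can be checked in etcd directly and the API server can be run at higher log verbosity. Kubernetes stores Node objects under the /registry/minions prefix; the exact etcdctl invocations below are assumptions and depend on which etcd API version the cluster uses:

# etcd v3 API: list node keys (values are protobuf, so print keys only)
ETCDCTL_API=3 etcdctl --endpoints=http://<etcd-host>:2379 get /registry/minions --prefix --keys-only

# etcd v2 API (older setups)
etcdctl --endpoints=http://<etcd-host>:2379 ls /registry/minions

# Restart kube-apiserver with a higher glog verbosity (e.g. --v=6) to capture the update conflict and the delete that follows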