pires / kubernetes-vagrant-coreos-cluster

Kubernetes cluster (for testing purposes) made easy with Vagrant and CoreOS.

Node becomes NotReady after vagrant reload #304

Open liubin opened 6 years ago

liubin commented 6 years ago

Hi, thanks for your work. I'm using the latest version of this repo and it works fine after vagrant up, but after vagrant reload, node-01 and node-02 become NotReady. I found this in the log of the kubelet container on node-02:

E0510 11:47:24.151857    1236 event.go:209] Unable to write event: 'Post https://__MASTER_IP__:443/api/v1/namespaces/default/events: dial tcp: lookup __MASTER_IP__ on 10.0.2.3:53: server misbehaving' (may retry after sleeping)

E0510 11:47:24.363225    1236 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Service: Get https://__MASTER_IP__:443/api/v1/services?limit=500&resourceVersion=0: dial tcp: lookup __MASTER_IP__ on 10.0.2.3:53: server misbehaving

It seems that the __MASTER_IP__ variable is not replaced with the real value.
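For reference, a quick way to check whether the substitution happened on an affected node is to inspect the kubeconfig the kubelet uses (assuming the path this repo provisions, /etc/kubernetes/node-kubeconfig.yaml, mentioned later in this thread):

$ vagrant ssh node-02
# if this still prints __MASTER_IP__, the placeholder was never replaced
$ grep 'server:' /etc/kubernetes/node-kubeconfig.yaml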

bmcustodio commented 6 years ago

@liubin I just tried with the following instructions and everything seems OK:

$ NODES=2 vagrant halt
$ NODES=2 vagrant up

This is equivalent to NODES=2 vagrant reload. Can you please provide the exact instructions you followed since creating the cluster?

liubin commented 6 years ago

I only ran vagrant reload, or vagrant halt followed by vagrant up.

liubin commented 6 years ago

After watching it for a while, I think the problem may be that the kubelet container starts before __MASTER_IP__ is replaced.

I can see that the file /etc/kubernetes/node-kubeconfig.yaml has the correct IP of the master, but kubelet's log shows that it is still using __MASTER_IP__. After restarting the kubelet with docker restart kubelet, the node becomes Ready.
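In case it helps others, a rough workaround sketch based on the above (assuming the kubelet runs as a Docker container named kubelet, as in the restart command above): after a vagrant reload, restart the kubelet container on each NotReady node so it re-reads the rendered kubeconfig.

$ vagrant ssh node-02
# confirm the rendered file now contains the real master IP, not __MASTER_IP__
$ grep 'server:' /etc/kubernetes/node-kubeconfig.yaml
# restart the kubelet container so it picks up the replaced value
$ docker restart kubelet
$ exit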