donnlee closed this issue 3 years ago
@donnlee thanks for reporting this. I'm afraid this particular guide is no longer actively maintained. What is it that you wanted to learn from this guide? I'd be happy to speak on Slack and help you out with any specific questions.
tl;dr If you change 'reboot-strategy' to something other than 'off' (I used 'etcd-lock'), you will fix this issue.
Problem: 'vagrant up' will sometimes result in a wedged kube-0x VM because locksmithd is masked and stuck. 'kubectl get nodes' shows only 1 node is up.
Wedged install-kubernetes unit on kube-02 looks like:
core@kube-02 ~ $ systemctl list-units | egrep "install-k|locksm"
  install-kubernetes.service  loaded activating start   start Download Kubernetes Binaries
● locksmithd.service          masked active     running       locksmithd.service
kube-02 not an active node:
$ vagrant ssh kube-01
CoreOS stable (1068.10.0)
Last login: Fri Aug 26 20:07:18 2016 from 10.0.2.2
Update Strategy: No Reboots
core@kube-01 ~ $ kubectl get nodes
NAME           LABELS                                STATUS
172.17.8.103   kubernetes.io/hostname=172.17.8.103   Ready
core@kube-01 ~ $
Solution: Change reboot-strategy to 'etcd-lock' in https://github.com/weaveworks/guides/blob/master/kubernetes/coreos/kubernetes-cluster.yaml
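For reference, here is a minimal sketch of what the changed setting might look like. It assumes kubernetes-cluster.yaml is a standard CoreOS cloud-config with the key under `coreos.update`. I have not reproduced the rest of the file, so treat the surrounding layout as an assumption.

```yaml
#cloud-config
# Sketch only: the placement under coreos.update follows the usual
# CoreOS cloud-config layout and is assumed, not copied from the guide.
coreos:
  update:
    # Was 'off'. Any other strategy avoided the wedge for me; I used etcd-lock.
    reboot-strategy: etcd-lock
```

The new value probably only takes effect on VMs created after the change, since cloud-config is normally read at provisioning time, so the existing VMs may need to be recreated.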
Ref: https://github.com/coreos/bugs/issues/838