weaveworks-guides / weave-net-legacy

Weave Net Old Guides
Apache License 2.0

kubernetes/coreos guide fails sometimes #216

Closed donnlee closed 3 years ago

donnlee commented 8 years ago

tl;dr: Changing 'reboot-strategy' from 'off' to another value (I used 'etcd-lock') fixes this issue.

Problem: 'vagrant up' sometimes leaves a kube-0x VM wedged because locksmithd is masked and stuck. 'kubectl get nodes' then shows only 1 node up.

The wedged install-kubernetes unit on kube-02 looks like this:

core@kube-02 ~ $ systemctl list-units | egrep "install-k|locksm"
install-kubernetes.service loaded activating start start Download Kubernetes Binaries
● locksmithd.service masked active running locksmithd.service

kube-02 is not an active node:

$ vagrant ssh kube-01
CoreOS stable (1068.10.0)
Last login: Fri Aug 26 20:07:18 2016 from 10.0.2.2
Update Strategy: No Reboots
core@kube-01 ~ $ kubectl get nodes
NAME LABELS STATUS
172.17.8.103 kubernetes.io/hostname=172.17.8.103 Ready
core@kube-01 ~ $

Solution: Change reboot-strategy to 'etcd-lock' in https://github.com/weaveworks/guides/blob/master/kubernetes/coreos/kubernetes-cluster.yaml
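
For illustration, the change would look roughly like the fragment below. This is a sketch only: it assumes kubernetes-cluster.yaml is a standard CoreOS cloud-config with a coreos.update section, and it omits every other key in the file. I have not reproduced the guide's actual contents here.

```yaml
#cloud-config
# Sketch only: all other keys in the guide's kubernetes-cluster.yaml are omitted.
coreos:
  update:
    # Was 'off', which showed up as "Update Strategy: No Reboots" at login.
    # 'etcd-lock' is the value I used; other valid strategies are
    # 'reboot' and 'best-effort'.
    reboot-strategy: etcd-lock
```

After editing the file, the VMs need to be recreated (e.g. 'vagrant destroy' followed by 'vagrant up') for the new cloud-config to be applied on first boot.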

Ref: https://github.com/coreos/bugs/issues/838

errordeveloper commented 7 years ago

@donnlee thanks for reporting this. I'm afraid this particular guide is no longer actively maintained. What is it that you wanted to learn from this guide? I'd be happy to speak on Slack and help you out with any specific questions.