0.8.0 fails after reboot

joshuacox commented 8 years ago

I know this is already pointed out in the documentation, but since there is a thread for 0.7:

https://github.com/luxas/kubernetes-on-arm/issues/112

I thought I'd start one here. We ought to at least document what to do after a restart. Right now I'm doing this on the master:

kube-config disable
kube-config enable-master

and then on each worker I do nearly the same thing:

kube-config disable
kube-config enable-worker 192.168.1.100
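
The two worker commands above could be wrapped in a small function so that each worker is reset with a single call. This is only a sketch: `reset_worker` and the `KUBE_CONFIG` override are hypothetical names introduced here for illustration (they are not part of kube-config), and the `yes Y` pipe is borrowed from the master script later in this thread on the assumption that `disable` asks for the same confirmation on a worker.

```bash
#!/bin/bash
# Sketch: reset a worker and rejoin it to the master in one step.
# KUBE_CONFIG is a hypothetical override (not a kube-config feature);
# setting KUBE_CONFIG=echo turns the function into a dry run.
KUBE_CONFIG="${KUBE_CONFIG:-kube-config}"

reset_worker() {
    local master_ip="${1:-}"
    if [[ -z "$master_ip" ]]; then
        echo "usage: reset_worker <master-ip>" >&2
        return 1
    fi
    # 'yes Y' answers any confirmation prompt from 'disable' (assumed).
    yes Y | "$KUBE_CONFIG" disable
    "$KUBE_CONFIG" enable-worker "$master_ip"
}

# Example, using the master address from this thread:
# reset_worker 192.168.1.100
```

Running `KUBE_CONFIG=echo reset_worker 192.168.1.100` prints the two commands' arguments instead of executing them, which is a cheap way to check the call before wiping a node.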

now back on the master everything looks happy:

# kubectl cluster-info
Kubernetes master is running at http://localhost:8080
KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

I have been letting the disable command wipe out the kubelet directory. Does anyone have experience persisting everything?

joshuacox commented 8 years ago

I'm doing this daily now, so I added this little ditty to do it all in one (notice it waits for 8080 to come up before kicking off the addons):

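#!/bin/bash
# Wipe the current state; 'yes Y' answers the confirmation prompt.
yes Y | kube-config disable
kube-config enable-master
# Wait until the API server answers on port 8080 before enabling addons.
while ! curl --output /dev/null --silent --head --fail http://localhost:8080; do
    sleep 2 && echo -n .
done
echo -n 'Kube server up!'
# Pause four more seconds, printing a '!' each second.
sleep 1; echo -n '!'; sleep 1; echo -n '!'; sleep 1; echo -n '!'
sleep 1; echo '!'
kube-config enable-addon registry
kube-config enable-addon loadbalancer
kube-config enable-addon heapster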
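
One caveat with the wait loop in the script above: if the API server never comes up, it spins forever. A bounded retry could look roughly like the sketch below. `wait_until` is a hypothetical helper name introduced here, not something shipped with kube-config, and the retry count and delay are arbitrary.

```bash
#!/bin/bash
# wait_until MAX_TRIES DELAY CMD...
# Runs CMD until it succeeds, at most MAX_TRIES times, sleeping DELAY
# seconds after each failed attempt. Returns 0 on success, 1 on give-up.
wait_until() {
    local max_tries="$1" delay="$2"
    shift 2
    local attempt
    for (( attempt = 1; attempt <= max_tries; attempt++ )); do
        if "$@"; then
            return 0
        fi
        sleep "$delay"
    done
    return 1
}

# Example with the same curl probe as the script (60 tries, 2 s apart):
# wait_until 60 2 curl --output /dev/null --silent --head --fail http://localhost:8080 \
#     || { echo 'API server did not come up' >&2; exit 1; }
```

With this in place, the addons are only enabled once the probe has succeeded, and a master that never recovers produces an error instead of a hung script.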

joshuacox commented 8 years ago

btw, some power supplies (even if marked 2.5 A or higher) just suck at powering an rPi. Swapping out a pesky one that happened to be attached to my master eliminated my daily restart issue.

I'm going to close this issue to clear up the open ones a bit. Feel free to add notes for anyone else here. I might not be taking the most 'optimal' route by just clearing out and restarting with yes Y|kube-config disable, and I'm open to discussion.

That being said, I'll close this up and leave what a dead k8s looks like after a reboot, right before I run the above script, just in case it helps someone get here by searching:

[root@apollo ~]# docker ps
CONTAINER ID        IMAGE                                                COMMAND                  CREATED             STATUS              PORTS               NAMES
a1a823f1d169        gcr.io/google_containers/hyperkube-arm:v1.3.6        "/setup-files.sh IP:1"   15 hours ago        Up 10 hours                             k8s_setup.e7e837b8_k8s-master-192.168.2.100_kube-system_ae686c7fc175b500b899801f6c8067a6_cad55286
7e85cd7d1574        gcr.io/google_containers/hyperkube-arm:v1.3.6        "/hyperkube scheduler"   15 hours ago        Up 15 hours                             k8s_scheduler.e737fc67_k8s-master-192.168.2.100_kube-system_ae686c7fc175b500b899801f6c8067a6_d3fed3e6
deb3851baeb0        gcr.io/google_containers/hyperkube-arm:v1.3.6        "/copy-addons.sh"        15 hours ago        Up 15 hours                             k8s_kube-addon-manager-data.1fa1945_kube-addon-manager-192.168.2.100_kube-system_12985f6c2276246edebe61849d91e5be_082d9e4d
2f0c542d0db3        gcr.io/google-containers/kube-addon-manager-arm:v4   "/opt/kube-addons.sh"    15 hours ago        Up 15 hours                             k8s_kube-addon-manager.f42a8c48_kube-addon-manager-192.168.2.100_kube-system_12985f6c2276246edebe61849d91e5be_a491b356
e0d40ca05b11        gcr.io/google_containers/hyperkube-arm:v1.3.6        "/hyperkube proxy --m"   15 hours ago        Up 15 hours                             k8s_kube-proxy.3d0a47fc_k8s-proxy-192.168.2.100_kube-system_a1f94bf6df71ad6ef8dd95542737efd3_cdf57f46
2a0834d87e8c        gcr.io/google_containers/pause-arm:3.0               "/pause"                 15 hours ago        Up 15 hours                             k8s_POD.da6fe110_k8s-master-192.168.2.100_kube-system_ae686c7fc175b500b899801f6c8067a6_83921def
0cd2415df5be        gcr.io/google_containers/pause-arm:3.0               "/pause"                 15 hours ago        Up 15 hours                             k8s_POD.da6fe110_kube-addon-manager-192.168.2.100_kube-system_12985f6c2276246edebe61849d91e5be_ebb792b7
e2f3b1e37b6c        gcr.io/google_containers/pause-arm:3.0               "/pause"                 15 hours ago        Up 15 hours                             k8s_POD.da6fe110_k8s-proxy-192.168.2.100_kube-system_a1f94bf6df71ad6ef8dd95542737efd3_49e6f04f
90f2dd9ca1ea        gcr.io/google_containers/hyperkube-arm:v1.3.6        "/hyperkube kubelet -"   3 days ago          Up 15 hours                             kube_kubelet_2cc6a
[root@apollo ~]# kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?
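
The "connection refused" symptom above can also be detected from a script before deciding whether a reset is needed. The following is a rough sketch only: `api_port_open` is a hypothetical helper name, and it relies on bash's built-in `/dev/tcp` redirection, so it needs bash rather than a plain POSIX sh. It checks only that something accepts TCP connections on the port, not that the API server is healthy.

```bash
#!/bin/bash
# api_port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened.
api_port_open() {
    local host="$1" port="$2"
    # /dev/tcp/HOST/PORT is handled by bash itself. The subshell ensures
    # file descriptor 3 is closed again as soon as the check finishes.
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Example: only reset the master when nothing listens on 8080.
# if ! api_port_open localhost 8080; then
#     echo 'API server is down, running the reset script' >&2
# fi
```

On localhost a closed port is refused immediately, so the check returns quickly. Against a remote host behind a firewall it could block until the connection attempt times out.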

0.8.0 fails after reboot #128