Hi, thanks for providing this template! It works great right after deployment, but k8s seems to be half-broken after a simple instance reboot. I tried a couple of deployments, and it's perfectly reproducible.
After deployment (before reboot)
There are 20 containers running:
root@ip-172-30-2-247 ~ # docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
72772b50ff39 eb516548c180 "/coredns -conf /e..." 4 minutes ago Up 4 minutes k8s_coredns_coredns-fb8b8dccf-46ttk_kube-system_495e0b66-87f1-11e9-af26-0a1ee8c58078_0
74071e694ec0 eb516548c180 "/coredns -conf /e..." 4 minutes ago Up 4 minutes k8s_coredns_coredns-fb8b8dccf-b9mhj_kube-system_4963984e-87f1-11e9-af26-0a1ee8c58078_0
3ae3ff9e9c73 k8s.gcr.io/pause:3.1 "/pause" 4 minutes ago Up 4 minutes k8s_POD_coredns-fb8b8dccf-b9mhj_kube-system_4963984e-87f1-11e9-af26-0a1ee8c58078_43
df8b0eef7ec3 k8s.gcr.io/pause:3.1 "/pause" 4 minutes ago Up 4 minutes k8s_POD_coredns-fb8b8dccf-46ttk_kube-system_495e0b66-87f1-11e9-af26-0a1ee8c58078_41
623c4cc86d76 b4d7c4247c3a "start_runit" 4 minutes ago Up 4 minutes k8s_calico-node_calico-node-8rckj_kube-system_4958e121-87f1-11e9-af26-0a1ee8c58078_2
8106f5622be7 0bd1f99c7034 "/usr/bin/kube-con..." 4 minutes ago Up 4 minutes k8s_calico-kube-controllers_calico-kube-controllers-8649d847c4-29xss_kube-system_495de6cf-87f1-11e9-af26-0a1ee8c5807
7d70242f82fa quay.io/coreos/etcd@sha256:... "/usr/local/bin/et..." 4 minutes ago Up 4 minutes k8s_calico-etcd_calico-etcd-qzbks_kube-system_55247f87-87f1-11e9-af26-0a1ee8c58078_0
49475b87dfe6 k8s.gcr.io/pause:3.1 "/pause" 4 minutes ago Up 4 minutes k8s_POD_calico-etcd-qzbks_kube-system_55247f87-87f1-11e9-af26-0a1ee8c58078_0
ceb5a5d082bb k8s.gcr.io/pause:3.1 "/pause" 4 minutes ago Up 4 minutes k8s_POD_calico-kube-controllers-8649d847c4-29xss_kube-system_495de6cf-87f1-11e9-af26-0a1ee8c58078_0
9529052099c3 20a2d7035165 "/usr/local/bin/ku..." 5 minutes ago Up 5 minutes k8s_kube-proxy_kube-proxy-fx548_kube-system_4958d277-87f1-11e9-af26-0a1ee8c58078_0
aea199144e04 k8s.gcr.io/pause:3.1 "/pause" 5 minutes ago Up 5 minutes k8s_POD_calico-node-8rckj_kube-system_4958e121-87f1-11e9-af26-0a1ee8c58078_0
e9f5fe880890 k8s.gcr.io/pause:3.1 "/pause" 5 minutes ago Up 5 minutes k8s_POD_kube-proxy-fx548_kube-system_4958d277-87f1-11e9-af26-0a1ee8c58078_0
a40f97c206e8 2c4adeb21b4f "etcd --advertise-..." 5 minutes ago Up 5 minutes k8s_etcd_etcd-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_cd3d6cd87a522a8d47f9f84a29a21085_0
f07f1230fcbe 8931473d5bdb "kube-scheduler --..." 5 minutes ago Up 5 minutes k8s_kube-scheduler_kube-scheduler-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_f44110a0ca540009109bfc
a8bb045ae863 cfaa4ad74c37 "kube-apiserver --..." 5 minutes ago Up 5 minutes k8s_kube-apiserver_kube-apiserver-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_5d623fb4138e843edbe51b
8557fa2cac2c efb3887b411d "kube-controller-m..." 5 minutes ago Up 5 minutes k8s_kube-controller-manager_kube-controller-manager-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_4d4b
cec3e482bb98 k8s.gcr.io/pause:3.1 "/pause" 5 minutes ago Up 5 minutes k8s_POD_kube-controller-manager-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_4d4b59c11383339b1dbc6957
25a629730365 k8s.gcr.io/pause:3.1 "/pause" 5 minutes ago Up 5 minutes k8s_POD_kube-scheduler-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_f44110a0ca540009109bfc32a7eb0baa_
25618cb3d3b5 k8s.gcr.io/pause:3.1 "/pause" 5 minutes ago Up 5 minutes k8s_POD_etcd-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_cd3d6cd87a522a8d47f9f84a29a21085_0
a45912a2e9b0 k8s.gcr.io/pause:3.1 "/pause" 5 minutes ago Up 5 minutes k8s_POD_kube-apiserver-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_5d623fb4138e843edbe51bb363cb7fdc_
And kubectl works:
root@ip-172-30-2-247 ~ # kubectl --kubeconfig /etc/kubernetes/admin.conf get all --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/calico-etcd-qzbks 1/1 Running 0 2m30s
kube-system pod/calico-kube-controllers-8649d847c4-29xss 1/1 Running 1 2m50s
kube-system pod/calico-node-8rckj 1/1 Running 2 2m50s
kube-system pod/coredns-fb8b8dccf-46ttk 1/1 Running 0 2m50s
kube-system pod/coredns-fb8b8dccf-b9mhj 1/1 Running 0 2m50s
kube-system pod/etcd-ip-172-30-2-247.ap-southeast-2.compute.internal 1/1 Running 0 117s
kube-system pod/kube-apiserver-ip-172-30-2-247.ap-southeast-2.compute.internal 1/1 Running 0 101s
kube-system pod/kube-controller-manager-ip-172-30-2-247.ap-southeast-2.compute.internal 1/1 Running 0 2m1s
kube-system pod/kube-proxy-fx548 1/1 Running 0 2m50s
kube-system pod/kube-scheduler-ip-172-30-2-247.ap-southeast-2.compute.internal 1/1 Running 0 106s
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 2m57s
kube-system service/calico-etcd ClusterIP 10.96.232.136 <none> 6666/TCP 2m55s
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 2m56s
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system daemonset.apps/calico-etcd 1 1 1 1 1 <none> 2m55s
kube-system daemonset.apps/calico-node 1 1 1 1 1 beta.kubernetes.io/os=linux 2m55s
kube-system daemonset.apps/kube-proxy 1 1 1 1 1 <none> 2m55s
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system deployment.apps/calico-kube-controllers 1/1 1 1 2m55s
kube-system deployment.apps/coredns 2/2 2 2 2m56s
NAMESPACE NAME DESIRED CURRENT READY AGE
kube-system replicaset.apps/calico-kube-controllers-8649d847c4 1 1 1 2m50s
kube-system replicaset.apps/coredns-fb8b8dccf 2 2 2 2m50s
After reboot
Only 6 containers come up:
root@ip-172-30-2-247 ~ # docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ae9eadd98658 cfaa4ad74c37 "kube-apiserver --..." 19 seconds ago Up 19 seconds k8s_kube-apiserver_kube-apiserver-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_5d623fb4138e843edbe51bb363cb7fdc_7
5fa7ba84ad9e efb3887b411d "kube-controller-m..." 13 minutes ago Up 13 minutes k8s_kube-controller-manager_kube-controller-manager-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_4d4b59c11383339b1dbc6957db2b1aac_1
ef7970cc7a31 8931473d5bdb "kube-scheduler --..." 13 minutes ago Up 13 minutes k8s_kube-scheduler_kube-scheduler-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_f44110a0ca540009109bfc32a7eb0baa_1
517eb8dc2228 k8s.gcr.io/pause:3.1 "/pause" 13 minutes ago Up 13 minutes k8s_POD_kube-controller-manager-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_4d4b59c11383339b1dbc6957db2b1aac_1
e321d41d616f k8s.gcr.io/pause:3.1 "/pause" 13 minutes ago Up 13 minutes k8s_POD_kube-apiserver-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_5d623fb4138e843edbe51bb363cb7fdc_1
76d753980f61 k8s.gcr.io/pause:3.1 "/pause" 13 minutes ago Up 13 minutes k8s_POD_kube-scheduler-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_f44110a0ca540009109bfc32a7eb0baa_1
009531cebfa0 k8s.gcr.io/pause:3.1 "/pause" 13 minutes ago Up 13 minutes k8s_POD_etcd-ip-172-30-2-247.ap-southeast-2.compute.internal_kube-system_cd3d6cd87a522a8d47f9f84a29a21085_1
And kubectl doesn't work:
root@ip-172-30-2-247 ~ # kubectl --kubeconfig /etc/kubernetes/admin.conf get all
The connection to the server 172.30.2.247:6443 was refused - did you specify the right host or port?
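I'm assuming nothing is actually listening on 6443 at that point. For what it's worth, this is roughly how I've been double-checking the port and the apiserver container's restart history (assuming ss is available on the host; the name filter just matches the container names above):
root@ip-172-30-2-247 ~ # ss -tlnp | grep 6443
root@ip-172-30-2-247 ~ # docker ps -a --filter name=k8s_kube-apiserver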
I gave it more than enough time to come up, but still no go. It looks like the issue is with the kube-apiserver, which keeps starting and failing over and over again.
Unfortunately I'm not much of a Kubernetes expert, so I don't know where to look to fix it.
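If it helps, I'm happy to collect more logs. I'm guessing these are the right places to look, but please correct me if not (the container ID is a placeholder for whichever exited kube-apiserver container docker ps -a shows, and I'm assuming /etc/kubernetes/manifests is where kubeadm keeps the static pod manifests):
root@ip-172-30-2-247 ~ # journalctl -u kubelet --no-pager | tail -n 100
root@ip-172-30-2-247 ~ # docker logs <container-id-of-exited-kube-apiserver>
root@ip-172-30-2-247 ~ # ls -l /etc/kubernetes/manifests/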
Any chance you could have a look at it?
Thanks!
Michael