frankisinfotech / k8s-HA-Multi-Master-Node


Cluster did not come up again #1

Open n00bsi opened 6 months ago

n00bsi commented 6 months ago

Hi, thanks for your documentation.

I had a running cluster and shut down all the nodes. When I started them again, no node brought its Docker containers back up, so the cluster is not running:

kubectl get nodes

The connection to the server 192.168.122.210:6443 was refused - did you specify the right host or port?

on all nodes:

docker ps

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

What did I do wrong?

frankisinfotech commented 6 months ago

Hello @n00bsi,

You have to restart the Docker service on the worker nodes. Once that's done, you may need to rejoin the nodes to the cluster using the `kubeadm join` command.
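A minimal sketch of those steps, assuming systemd-managed hosts (the `<token>` and `<hash>` are placeholders; note that `kubeadm reset` wipes the node's local cluster state, so only use it on a node that really has to rejoin from scratch):

```shell
# On each worker: restart the container runtime, then the kubelet
sudo systemctl restart docker
sudo systemctl restart kubelet

# On one control-plane node: print a fresh join command
# (bootstrap tokens expire, so an old one may no longer work)
kubeadm token create --print-join-command

# On a worker that has to rejoin from scratch: reset, then run the printed command
sudo kubeadm reset -f
sudo kubeadm join 192.168.122.210:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>
```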

Let me know how it goes..

Regards

n00bsi commented 6 months ago

Hi @frankisinfotech

thanks for your quick response!

Load balancers: 192.168.1.200 (lb1) and 192.168.1.201 (lb2), with virtual IP 192.168.1.210, built with keepalived and haproxy.

# kubectl get nodes
NAME     STATUS   ROLES           AGE   VERSION
k8sn12   Ready    control-plane   48m   v1.29.2
k8sn13   Ready    control-plane   42m   v1.29.2
k8sn14   Ready    control-plane   41m   v1.29.2
k8sn15   Ready    worker          40m   v1.29.2
k8sn16   Ready    worker          40m   v1.29.2
k8sn17   Ready    worker          40m   v1.29.2
# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE   ERROR
scheduler            Healthy   ok        
controller-manager   Healthy   ok        
etcd-0               Healthy   ok        

# kubectl cluster-info
Kubernetes control plane is running at https://192.168.122.210:6443
CoreDNS is running at https://192.168.122.210:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
# docker ps
CONTAINER ID   IMAGE                       COMMAND                  CREATED             STATUS             PORTS     NAMES
684e68f89653   cbb01a7bd410                "/coredns -conf /etc…"   About an hour ago   Up About an hour             k8s_coredns_coredns-76f75df574-qqpwk_kube-system_9fa9f8cd-4c5b-42e8-afb1-5b2242575b60_0
aef9ab6ee3d2   cbb01a7bd410                "/coredns -conf /etc…"   About an hour ago   Up About an hour             k8s_coredns_coredns-76f75df574-97vns_kube-system_947c4140-f628-430e-a513-29c43a353b13_0
84dab5b0494e   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Up About an hour             k8s_POD_coredns-76f75df574-97vns_kube-system_947c4140-f628-430e-a513-29c43a353b13_12
0cdee4782e43   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Up About an hour             k8s_POD_coredns-76f75df574-qqpwk_kube-system_9fa9f8cd-4c5b-42e8-afb1-5b2242575b60_13
0e7bc9bcb854   calico/node                 "start_runit"            About an hour ago   Up About an hour             k8s_calico-node_calico-node-rdfkb_kube-system_6b99c65f-ea04-48a2-8099-664da96bd6ac_0
d75d7c9685e6   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Up About an hour             k8s_POD_calico-node-rdfkb_kube-system_6b99c65f-ea04-48a2-8099-664da96bd6ac_0
e15ab14ecb2f   9344fce2372f                "/usr/local/bin/kube…"   About an hour ago   Up About an hour             k8s_kube-proxy_kube-proxy-qzppg_kube-system_1b9bdc1f-8cf1-46b3-8e0f-64baf246ea00_0
4cba28443219   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Up About an hour             k8s_POD_kube-proxy-qzppg_kube-system_1b9bdc1f-8cf1-46b3-8e0f-64baf246ea00_0
e0b2a304043f   a0eed15eed44                "etcd --advertise-cl…"   About an hour ago   Up About an hour             k8s_etcd_etcd-k8sn12_kube-system_4f51db15b6b484d2460b7869e55f45b5_0
088bb3cd6314   6fc5e6b7218c                "kube-scheduler --au…"   About an hour ago   Up About an hour             k8s_kube-scheduler_kube-scheduler-k8sn12_kube-system_7278e1f7561308c569511d3f5be72084_0
fb82bb8bd830   8a9000f98a52                "kube-apiserver --ad…"   About an hour ago   Up About an hour             k8s_kube-apiserver_kube-apiserver-k8sn12_kube-system_126bc6088a3f6b632c5e4a2f70040730_0
304cc03cbe15   138fb5a3a2e3                "kube-controller-man…"   About an hour ago   Up About an hour             k8s_kube-controller-manager_kube-controller-manager-k8sn12_kube-system_dd4b62d7463f4ab6887163c1e1f91d75_0
afa0bbf34e02   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Up About an hour             k8s_POD_kube-controller-manager-k8sn12_kube-system_dd4b62d7463f4ab6887163c1e1f91d75_0
20e6667d1db4   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Up About an hour             k8s_POD_kube-apiserver-k8sn12_kube-system_126bc6088a3f6b632c5e4a2f70040730_0
3ac431adec77   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Up About an hour             k8s_POD_etcd-k8sn12_kube-system_4f51db15b6b484d2460b7869e55f45b5_0
83bfc88faf71   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Up About an hour             k8s_POD_kube-scheduler-k8sn12_kube-system_7278e1f7561308c569511d3f5be72084_0

But when I shut down all 7 of my VMs and start them again, no containers are running...

on all nodes:

# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
n00bsi commented 6 months ago

If I shutdown only one control-plane and start it again:

# kubectl get nodes
NAME     STATUS     ROLES           AGE   VERSION
k8sn12   NotReady   control-plane   87m   v1.29.2
k8sn13   Ready      control-plane   80m   v1.29.2
k8sn14   Ready      control-plane   79m   v1.29.2
k8sn15   Ready      worker          78m   v1.29.2
k8sn16   Ready      worker          78m   v1.29.2
k8sn17   Ready      worker          78m   v1.29.2

node: k8sn12


# docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

On node k8sn12 it is the same: no Docker containers started after the reboot.

# docker ps -a
CONTAINER ID   IMAGE                       COMMAND                  CREATED             STATUS                         PORTS     NAMES
684e68f89653   cbb01a7bd410                "/coredns -conf /etc…"   About an hour ago   Exited (0) 4 minutes ago                 k8s_coredns_coredns-76f75df574-qqpwk_kube-system_9fa9f8cd-4c5b-42e8-afb1-5b2242575b60_0
aef9ab6ee3d2   cbb01a7bd410                "/coredns -conf /etc…"   About an hour ago   Exited (0) 4 minutes ago                 k8s_coredns_coredns-76f75df574-97vns_kube-system_947c4140-f628-430e-a513-29c43a353b13_0
84dab5b0494e   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Exited (0) 4 minutes ago                 k8s_POD_coredns-76f75df574-97vns_kube-system_947c4140-f628-430e-a513-29c43a353b13_12
0cdee4782e43   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Exited (0) 4 minutes ago                 k8s_POD_coredns-76f75df574-qqpwk_kube-system_9fa9f8cd-4c5b-42e8-afb1-5b2242575b60_13
0e7bc9bcb854   calico/node                 "start_runit"            About an hour ago   Exited (0) 4 minutes ago                 k8s_calico-node_calico-node-rdfkb_kube-system_6b99c65f-ea04-48a2-8099-664da96bd6ac_0
2fbb1d07a8d0   calico/pod2daemon-flexvol   "/usr/local/bin/flex…"   About an hour ago   Exited (0) About an hour ago             k8s_flexvol-driver_calico-node-rdfkb_kube-system_6b99c65f-ea04-48a2-8099-664da96bd6ac_0
3ad86fd9269c   42b7a3f2bfdf                "/install-cni.sh"        About an hour ago   Exited (0) About an hour ago             k8s_install-cni_calico-node-rdfkb_kube-system_6b99c65f-ea04-48a2-8099-664da96bd6ac_0
24a3ccac6ccc   calico/cni                  "/opt/cni/bin/calico…"   About an hour ago   Exited (0) About an hour ago             k8s_upgrade-ipam_calico-node-rdfkb_kube-system_6b99c65f-ea04-48a2-8099-664da96bd6ac_0
d75d7c9685e6   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Exited (0) 4 minutes ago                 k8s_POD_calico-node-rdfkb_kube-system_6b99c65f-ea04-48a2-8099-664da96bd6ac_0
e15ab14ecb2f   9344fce2372f                "/usr/local/bin/kube…"   About an hour ago   Exited (2) 4 minutes ago                 k8s_kube-proxy_kube-proxy-qzppg_kube-system_1b9bdc1f-8cf1-46b3-8e0f-64baf246ea00_0
4cba28443219   registry.k8s.io/pause:3.9   "/pause"                 About an hour ago   Exited (0) 4 minutes ago                 k8s_POD_kube-proxy-qzppg_kube-system_1b9bdc1f-8cf1-46b3-8e0f-64baf246ea00_0
e0b2a304043f   a0eed15eed44                "etcd --advertise-cl…"   2 hours ago         Exited (0) 4 minutes ago                 k8s_etcd_etcd-k8sn12_kube-system_4f51db15b6b484d2460b7869e55f45b5_0
088bb3cd6314   6fc5e6b7218c                "kube-scheduler --au…"   2 hours ago         Exited (0) 4 minutes ago                 k8s_kube-scheduler_kube-scheduler-k8sn12_kube-system_7278e1f7561308c569511d3f5be72084_0
fb82bb8bd830   8a9000f98a52                "kube-apiserver --ad…"   2 hours ago         Exited (137) 4 minutes ago               k8s_kube-apiserver_kube-apiserver-k8sn12_kube-system_126bc6088a3f6b632c5e4a2f70040730_0
304cc03cbe15   138fb5a3a2e3                "kube-controller-man…"   2 hours ago         Exited (2) 4 minutes ago                 k8s_kube-controller-manager_kube-controller-manager-k8sn12_kube-system_dd4b62d7463f4ab6887163c1e1f91d75_0
afa0bbf34e02   registry.k8s.io/pause:3.9   "/pause"                 2 hours ago         Exited (0) 4 minutes ago                 k8s_POD_kube-controller-manager-k8sn12_kube-system_dd4b62d7463f4ab6887163c1e1f91d75_0
20e6667d1db4   registry.k8s.io/pause:3.9   "/pause"                 2 hours ago         Exited (0) 4 minutes ago                 k8s_POD_kube-apiserver-k8sn12_kube-system_126bc6088a3f6b632c5e4a2f70040730_0
3ac431adec77   registry.k8s.io/pause:3.9   "/pause"                 2 hours ago         Exited (0) 4 minutes ago                 k8s_POD_etcd-k8sn12_kube-system_4f51db15b6b484d2460b7869e55f45b5_0
83bfc88faf71   registry.k8s.io/pause:3.9   "/pause"                 2 hours ago         Exited (0) 4 minutes ago                 k8s_POD_kube-scheduler-k8sn12_kube-system_7278e1f7561308c569511d3f5be72084_0
# kubectl get pods -n kube-system
NAME                                       READY   STATUS             RESTARTS         AGE
calico-kube-controllers-67967d7dc6-smlzs   0/1     CrashLoopBackOff   20 (4m26s ago)   91m
calico-node-8bhqq                          1/1     Running            0                86m
calico-node-fjc8n                          1/1     Running            0                86m
calico-node-gp2wc                          1/1     Running            0                88m
calico-node-rdfkb                          1/1     Running            0                91m
calico-node-tc8jn                          1/1     Running            0                86m
calico-node-xqblm                          1/1     Running            0                87m
coredns-76f75df574-72g76                   0/1     Running            0                4m14s
coredns-76f75df574-97vns                   1/1     Terminating        0                94m
coredns-76f75df574-fxslf                   0/1     Running            0                4m14s
coredns-76f75df574-qqpwk                   1/1     Terminating        0                94m
etcd-k8sn12                                1/1     Running            0                94m
etcd-k8sn13                                1/1     Running            0                88m
etcd-k8sn14                                1/1     Running            0                87m
kube-apiserver-k8sn12                      1/1     Running            0                94m
kube-apiserver-k8sn13                      1/1     Running            0                88m
kube-apiserver-k8sn14                      1/1     Running            0                87m
kube-controller-manager-k8sn12             1/1     Running            0                94m
kube-controller-manager-k8sn13             1/1     Running            0                88m
kube-controller-manager-k8sn14             1/1     Running            0                87m
kube-proxy-7lcpc                           1/1     Running            0                86m
kube-proxy-87lkc                           1/1     Running            0                86m
kube-proxy-ctn9r                           1/1     Running            0                87m
kube-proxy-kngqb                           1/1     Running            0                88m
kube-proxy-qqmr9                           1/1     Running            0                86m
kube-proxy-qzppg                           1/1     Running            0                94m
kube-scheduler-k8sn12                      1/1     Running            0                94m
kube-scheduler-k8sn13                      1/1     Running            0                88m
kube-scheduler-k8sn14                      1/1     Running            0                87m
# journalctl -u kubelet
-- No entries --
n00bsi commented 6 months ago

I believe I found it!

The kubelet.service was not enabled.
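For anyone landing here with the same symptom, a sketch of the fix on systemd distros: enable the service so it survives reboots, then verify it, on every node.

```shell
# Enable kubelet at boot and start it immediately (run on every node)
sudo systemctl enable --now kubelet

# Verify
systemctl is-enabled kubelet    # should print "enabled"
systemctl is-active kubelet     # should print "active"
```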

frankisinfotech commented 6 months ago

I believe I found it!

The kubelet.service was not enabled.

Ahhhhhh....Nice one! Kubelet service wasn't running... Nice!

n00bsi commented 6 months ago

@frankisinfotech

But still not successful... now I have the issue that not everything is coming up:

# ss -tulpen | grep 6443
tcp   LISTEN 0      4096                 *:6443             *:*    users:(("kube-apiserver",pid=2010,fd=3)) ino:27982 sk:100b cgroup:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod126bc6088a3f6b632c5e4a2f70040730.slice/docker-7ed84d704ec4846eb8c8c2ff0574c814594d7d728b549af5def9a9a154fa67af.scope v6only:0 <->

# kubectl get nodes
The connection to the server 192.168.122.210:6443 was refused - did you specify the right host or port?

# ping 192.168.122.210
PING 192.168.122.210 (192.168.122.210) 56(84) bytes of data.
64 bytes from 192.168.122.210: icmp_seq=1 ttl=64 time=0.434 ms
64 bytes from 192.168.122.210: icmp_seq=2 ttl=64 time=0.474 ms
^C

# docker ps
CONTAINER ID   IMAGE                       COMMAND                  CREATED         STATUS         PORTS     NAMES
8ac7b0687d34   138fb5a3a2e3                "kube-controller-man…"   3 minutes ago   Up 3 minutes             k8s_kube-controller-manager_kube-controller-manager-k8sn12_kube-system_dd4b62d7463f4ab6887163c1e1f91d75_3
9824e1028a69   a0eed15eed44                "etcd --advertise-cl…"   3 minutes ago   Up 3 minutes             k8s_etcd_etcd-k8sn12_kube-system_4f51db15b6b484d2460b7869e55f45b5_3
8bdf2a8adcca   6fc5e6b7218c                "kube-scheduler --au…"   3 minutes ago   Up 3 minutes             k8s_kube-scheduler_kube-scheduler-k8sn12_kube-system_7278e1f7561308c569511d3f5be72084_3
7ed84d704ec4   8a9000f98a52                "kube-apiserver --ad…"   3 minutes ago   Up 3 minutes             k8s_kube-apiserver_kube-apiserver-k8sn12_kube-system_126bc6088a3f6b632c5e4a2f70040730_3
d1b6af215edb   registry.k8s.io/pause:3.9   "/pause"                 3 minutes ago   Up 3 minutes             k8s_POD_kube-scheduler-k8sn12_kube-system_7278e1f7561308c569511d3f5be72084_3
9d968141de9a   registry.k8s.io/pause:3.9   "/pause"                 3 minutes ago   Up 3 minutes             k8s_POD_etcd-k8sn12_kube-system_4f51db15b6b484d2460b7869e55f45b5_3
350f957c1a37   registry.k8s.io/pause:3.9   "/pause"                 3 minutes ago   Up 3 minutes             k8s_POD_kube-controller-manager-k8sn12_kube-system_dd4b62d7463f4ab6887163c1e1f91d75_3
1994cdda100a   registry.k8s.io/pause:3.9   "/pause"                 3 minutes ago   Up 3 minutes             k8s_POD_kube-apiserver-k8sn12_kube-system_126bc6088a3f6b632c5e4a2f70040730_3

When I change .210 to .202 in /etc/kubernetes/admin.conf,

then I get this:

# kubectl get nodes
NAME     STATUS     ROLES           AGE    VERSION
k8sn12   NotReady   control-plane   126m   v1.29.2
k8sn13   NotReady   control-plane   119m   v1.29.2
k8sn14   NotReady   control-plane   118m   v1.29.2
k8sn15   NotReady   worker          117m   v1.29.2
k8sn16   NotReady   worker          117m   v1.29.2
k8sn17   NotReady   worker          117m   v1.29.2

# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE   ERROR
controller-manager   Healthy   ok        
scheduler            Healthy   ok        
etcd-0               Healthy   ok        

# kubectl cluster-info
Kubernetes control plane is running at https://192.168.122.202:6443
CoreDNS is running at https://192.168.122.202:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

What did I get wrong in the configuration?
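As a side note, the edit described above can be tried on a scratch copy rather than the live file (the path `/tmp/sample.conf` and the single `server:` line below are illustrative, not the real admin.conf):

```shell
# Write a throwaway kubeconfig fragment that points at the VIP
printf 'server: https://192.168.122.210:6443\n' > /tmp/sample.conf

# Swap the VIP for a specific control-plane address, bypassing the LB entirely
sed -i 's|192\.168\.122\.210|192.168.122.202|' /tmp/sample.conf

cat /tmp/sample.conf
```

kubectl also accepts a one-off override without editing any file: `kubectl --server=https://192.168.122.202:6443 get nodes`. If that works while the VIP address does not, the problem is in the load-balancer layer, not in Kubernetes itself.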

n00bsi commented 6 months ago

found it!

haproxy was not enabled on either LB - sorry!
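For completeness, a sketch of enabling both services on each LB so they come back after a reboot (assuming systemd; the VIP is the one from this thread):

```shell
# On both load balancers: start keepalived and haproxy now and at every boot
sudo systemctl enable --now keepalived
sudo systemctl enable --now haproxy

# Quick smoke test: does the VIP answer on the API server port?
curl -k https://192.168.122.210:6443/healthz
```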

frankisinfotech commented 6 months ago

found it!

haproxy was not enabled on either LB - sorry!

Cool....You're doing great! Please share my channel with your network. i need more subscribers... Thank you