jheady / kubernetes_project

A project for building out a Kubernetes cluster using Packer, Ansible, and Terraform

Networking between hosts is not working properly #1

Closed: jheady closed this issue 3 months ago

jheady commented 1 year ago

In order to allow the hosts to communicate with each other properly, they need to be able to resolve each other by name. This can be done by updating the /etc/hosts file on each server to include the IP address and hostname of every server in the cluster.
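
A minimal sketch of the kind of entries involved, run on every node. The master and worker-1 addresses are taken from the ping output later in the thread; any additional workers and their addresses are assumptions and would follow the same pattern.

# append name/IP entries for all cluster members to /etc/hosts
cat <<'EOF' | sudo tee -a /etc/hosts
192.168.122.150  kube-master-node-1
192.168.122.15   kube-worker-node-1
EOF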

jheady commented 1 year ago

The servers in the cluster are unable to communicate with each other, either by IP address or by name resolution; pings fail with host unreachable or no route to host errors. Commit fb8bbea provides the /etc/hosts update that should allow the servers to see each other.

ansible@kube-worker-node-1:~$ ping kube-master-node-1
PING kube-master-node-1 (192.168.122.150) 56(84) bytes of data.
From kube-worker-node-1 (192.168.122.15) icmp_seq=1 Destination Host Unreachable
From kube-worker-node-1 (192.168.122.15) icmp_seq=2 Destination Host Unreachable
From kube-worker-node-1 (192.168.122.15) icmp_seq=3 Destination Host Unreachable
^C
--- kube-master-node-1 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4046ms
ansible@kube-master-node-1:~$ ping 192.168.122.15
ping: connect: No route to host
ansible@kube-master-node-1:~$ ping 192.168.122.192
ping: connect: No route to host
ansible@kube-master-node-1:~$
jheady commented 1 year ago

Even more oddities for this issue. On the last build of the cluster, only one worker node lost connectivity to the other worker and to the master. It was still able to reach the internet and the host machine, but not the other guests on the network. The other worker and the master node could communicate with each other with no problems. Rebooting the problematic guest had no effect, nor did using virsh to destroy and start the system. Checking journalctl _SYSTEMD_UNIT=libvirtd.service showed nothing that pointed to a cause.
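
For reference, roughly the commands described above, run from the hypervisor host; the guest/domain name here is an assumption based on the node names used elsewhere in the thread.

# force the problematic guest off and back on
virsh destroy kube-worker-node-1
virsh start kube-worker-node-1

# check the libvirt daemon logs for anything relevant
journalctl _SYSTEMD_UNIT=libvirtd.service --no-pager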

jheady commented 7 months ago

A potential reason for the network not working between the cluster nodes is that the machine IDs in /etc/machine-id are identical when the VMs are cloned. Rebuilding under Proxmox, and will be blanking the machine IDs prior to cloning the nodes.

From the kubeadm installation guide:

Verify the MAC address and product_uuid are unique for every node. You can get the MAC address of the network interfaces using the command ip link or ifconfig -a. The product_uuid can be checked by using the command sudo cat /sys/class/dmi/id/product_uuid. It is very likely that hardware devices will have unique addresses, although some virtual machines may have identical values. Kubernetes uses these values to uniquely identify the nodes in the cluster. If these values are not unique to each node, the installation process may fail.
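
A minimal sketch of checking those identifiers on each node and blanking the machine ID in the template before cloning, so each clone regenerates a unique one on first boot. Note that product_uuid is assigned by the hypervisor and cannot be changed from inside the guest; only /etc/machine-id is reset here.

# check the identifiers kubeadm cares about
ip link                                   # MAC addresses
sudo cat /sys/class/dmi/id/product_uuid   # product_uuid
cat /etc/machine-id                       # systemd machine ID

# in the template, before cloning: empty the machine ID so it is
# regenerated on the first boot of each clone
sudo truncate -s 0 /etc/machine-id
sudo rm -f /var/lib/dbus/machine-id
sudo ln -s /etc/machine-id /var/lib/dbus/machine-id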

jheady commented 3 months ago

Unable to resolve the issue, and I have moved to a new hypervisor system using Proxmox. Closing this issue as won't fix.