Open · Uncurlhalo opened this issue 2 months ago
Just an update: if the cluster is created without access_ip defined, everything works as expected.
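For illustration, this is the shape of the working configuration: the control-plane entries from the inventory in the report below with access_ip simply left out (a hypothetical excerpt, not copied from an actual cluster; the worker entries are analogous).

[all]
k8s-control-plane-0 ansible_host=192.168.1.200 ansible_user=k8s-node ip=192.168.1.200
k8s-control-plane-1 ansible_host=192.168.1.201 ansible_user=k8s-node ip=192.168.1.201
k8s-control-plane-2 ansible_host=192.168.1.202 ansible_user=k8s-node ip=192.168.1.202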
I have the same problem when mixing an internal ip with a real access_ip in a cloud environment.
kubespray commit 0b64ab1 (tag v2.24.3), ansible core 2.15.12, python 3.11.3
What happened?
While attempting to stand up a k8s cluster with 3 control-plane nodes and 3 workers, the playbook failed on the task that ensures etcd is running.
What did you expect to happen?
The playbook to complete successfully, with etcd running on the control-plane nodes.
How can we reproduce it (as minimally and precisely as possible)?
Install kubespray via ansible-galaxy per the documentation and create a playbook, cluster-install.yml, with the contents sketched below. Create 6 VMs and provision each with one public IP and one private IP (in my use case "public" meaning public to my LAN, not a host-only network on my hypervisor). Define these IPs as shown in my Ansible inventory. Execute the playbook with the command
ansible-playbook -i inventory/inventory.ini --become --become-user=root cluster-install.yml
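A minimal sketch of what cluster-install.yml contains when Kubespray is installed as an Ansible Galaxy collection (my assumption based on the Kubespray collection documentation; the actual file is not reproduced in this report):

- name: Install Kubernetes cluster
  ansible.builtin.import_playbook: kubernetes_sigs.kubespray.cluster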
OS
Linux 6.10.9-200.fc40.x86_64 x86_64
NAME="Fedora Linux"
VERSION="40 (Workstation Edition)"
ID=fedora
VERSION_ID=40
VERSION_CODENAME=""
PLATFORM_ID="platform:f40"
PRETTY_NAME="Fedora Linux 40 (Workstation Edition)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:40"
DEFAULT_HOSTNAME="fedora"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f40/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=40
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=40
SUPPORT_END=2025-05-13
VARIANT="Workstation Edition"
VARIANT_ID=workstation
Version of Ansible
ansible [core 2.16.10]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/jmelton/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.12/site-packages/ansible
  ansible collection location = /home/jmelton/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.12.5 (main, Aug 23 2024, 00:00:00) [GCC 14.2.1 20240801 (Red Hat 14.2.1-1)] (/usr/bin/python3)
  jinja version = 3.1.4
  libyaml = True
Version of Python
Python 3.12.5
Version of Kubespray (commit)
89ff0710e
Network plugin used
calico
Full inventory with variables
[all]
k8s-control-plane-0 ansible_host=192.168.1.200 ansible_user=k8s-node ip=192.168.1.200 access_ip=10.0.1.100
k8s-control-plane-1 ansible_host=192.168.1.201 ansible_user=k8s-node ip=192.168.1.201 access_ip=10.0.1.101
k8s-control-plane-2 ansible_host=192.168.1.202 ansible_user=k8s-node ip=192.168.1.202 access_ip=10.0.1.102
k8s-worker-node-0 ansible_host=192.168.1.210 ansible_user=k8s-node ip=192.168.1.210 access_ip=10.0.1.200
k8s-worker-node-1 ansible_host=192.168.1.211 ansible_user=k8s-node ip=192.168.1.211 access_ip=10.0.1.201
k8s-worker-node-2 ansible_host=192.168.1.212 ansible_user=k8s-node ip=192.168.1.212 access_ip=10.0.1.202

[kube_control_plane]
k8s-control-plane-0
k8s-control-plane-1
k8s-control-plane-2

[etcd]
k8s-control-plane-0
k8s-control-plane-1
k8s-control-plane-2

[kube_node]
k8s-worker-node-0
k8s-worker-node-1
k8s-worker-node-2

[k8s_cluster:children]
kube_node
kube_control_plane
Command used to invoke ansible
ansible-playbook -i inventory/inventory.ini --become --become-user=root cluster-install.yml
Output of ansible run
Anything else we need to know
A similar issue was opened here, but the user never triaged the problem. Additionally, here is some relevant information from the control-plane nodes.
The only thing of note in the output of journalctl -xeu etcd.service seems to be
resolved urls: "https://10.0.1.100:2380" != "https://192.168.1.200:2380"
I'm going to attempt recreation of the cluster with only the public IPs defined via ansible_host, but I'm pretty sure that was working fine the prior day.
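To illustrate the mismatch behind that error: etcd checks at startup that the peer URL a member advertises matches the URL recorded for that member in the initial cluster, and splitting ip and access_ip can leave the two pointing at different addresses. A hypothetical configuration excerpt using standard etcd environment variables (member names are made up, and this is not the actual file Kubespray rendered on my nodes):

# etcd1 advertises the access_ip while the cluster definition uses the ip,
# so etcd fails at startup with a "resolved urls: ... != ..." mismatch.
ETCD_NAME=etcd1
ETCD_LISTEN_PEER_URLS=https://192.168.1.200:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://10.0.1.100:2380
ETCD_INITIAL_CLUSTER=etcd1=https://192.168.1.200:2380,etcd2=https://192.168.1.201:2380,etcd3=https://192.168.1.202:2380
ETCD_INITIAL_CLUSTER_STATE=new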