kubernetes-sigs / kubespray

Deploy a Production Ready Kubernetes Cluster
Apache License 2.0

reset.yml presumes the OS default network manager #11579

Open bjolo opened 1 month ago

bjolo commented 1 month ago

What happened?

We are setting up Kubernetes on bare-metal servers running Rocky Linux 9. Instead of the default NetworkManager, we run systemd-networkd. Deployment works fine, but reset.yml fails because it is hardcoded to use NetworkManager on Red Hat-derived OSes.

What did you expect to happen?

reset.yml should restart the network using the active network management service, not the one that is the default for the OS.
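One way the expectation above could be met is sketched below. This is a minimal illustration, not Kubespray's actual fix: the task names and the candidate list are assumptions, and it relies only on `ansible.builtin.service_facts`, which records each unit's `state` under `ansible_facts.services`.

```yaml
# Hypothetical sketch: restart only the network service that is actually running.
- name: Reset | Gather service facts
  ansible.builtin.service_facts:

- name: Reset | Restart the active network service
  ansible.builtin.service:
    name: "{{ item }}"
    state: restarted
  # Candidate list is an assumption; extend it for other network managers.
  loop:
    - NetworkManager.service
    - systemd-networkd.service
  # Skip units that are absent or not running (a masked unit is not running).
  when: (ansible_facts.services[item] | default({})).state | default('') == 'running'
```

With a check like this, a host where NetworkManager is masked and systemd-networkd is running would skip the first item and restart only systemd-networkd.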

How can we reproduce it (as minimally and precisely as possible)?

  1. Deploy k8s on a Red Hat derivative OS that does not use the default NetworkManager, for example one running systemd-networkd.
  2. Run reset.yml.

OS

Linux 5.14.0-362.8.1.el9_3.x86_64 x86_64
NAME="Rocky Linux"
VERSION="9.4 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.4"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.4 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2032-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.4"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.4"

Version of Ansible

ansible [core 2.16.11]
  config file = /home/qbjolof/kubespray/ansible.cfg
  configured module search path = ['/home/qbjolof/kubespray/library']
  ansible python module location = /home/qbjolof/.venv3.12-kubespray/lib64/python3.12/site-packages/ansible
  ansible collection location = /home/qbjolof/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/qbjolof/.venv3.12-kubespray/bin/ansible
  python version = 3.12.1 (main, Aug 23 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)] (/home/qbjolof/.venv3.12-kubespray/bin/python3.12)
  jinja version = 3.1.4
  libyaml = True

Version of Python

Python 3.12.1

Version of Kubespray (commit)

f9ebd45c7

Network plugin used

calico

Full inventory with variables

n/a

Command used to invoke ansible

ansible-playbook -i inventory/mycluster/inventory.ini --become --user root reset.yml

Output of ansible run

TASK [reset : Reset | Restart network] **
fatal: [eselda07u31.xerces.lan]: FAILED! => {"changed": false, "msg": "Unable to start service NetworkManager: Failed to start NetworkManager.service: Unit NetworkManager.service is masked.\n"}
fatal: [eselda07u37.xerces.lan]: FAILED! => {"changed": false, "msg": "Unable to start service NetworkManager: Failed to start NetworkManager.service: Unit NetworkManager.service is masked.\n"}
fatal: [eselda08u31.xerces.lan]: FAILED! => {"changed": false, "msg": "Unable to start service NetworkManager: Failed to start NetworkManager.service: Unit NetworkManager.service is masked.\n"}
fatal: [eselda08u37.xerces.lan]: FAILED! => {"changed": false, "msg": "Unable to start service NetworkManager: Failed to start NetworkManager.service: Unit NetworkManager.service is masked.\n"}

Anything else we need to know

No response

KubeKyrie commented 1 month ago

Good find. I would be glad to fix it.

KubeKyrie commented 1 month ago

/assign