kubernetes-sigs / kubespray

Deploy a Production Ready Kubernetes Cluster
Apache License 2.0

Kubespray fails on Rocky Linux 9 when running on raspberry pi #11163

Closed franznemeth closed 2 months ago

franznemeth commented 2 months ago

What happened?

I'm trying to install Kubernetes with Kubespray in my homelab. The control plane node and a couple of worker nodes are x86, but one of the worker nodes is a Raspberry Pi 4 running Rocky Linux.

The Rocky Linux image for Raspberry Pi does not have the cpuset and memory cgroups enabled by default, which makes starting containers impossible.

Once I added cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory to /boot/cmdline.txt, the Raspberry Pi could be used as a worker node.
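For anyone applying the same workaround, here is a minimal Python sketch of the edit. It assumes the kernel command line is a single space-separated line, as in /boot/cmdline.txt. The function names and the example command line are made up for illustration and are not part of Kubespray or of the Rocky Linux image.

```python
# Kernel parameters named in this report.
REQUIRED_PARAMS = ("cgroup_enable=cpuset", "cgroup_memory=1", "cgroup_enable=memory")


def missing_cgroup_params(cmdline: str) -> list[str]:
    """Return the required parameters that are absent from the command line."""
    present = set(cmdline.split())
    return [param for param in REQUIRED_PARAMS if param not in present]


def with_cgroup_params(cmdline: str) -> str:
    """Return the command line with any missing parameters appended."""
    return " ".join(cmdline.split() + missing_cgroup_params(cmdline))


if __name__ == "__main__":
    # Hypothetical contents of /boot/cmdline.txt, for illustration only.
    example = "console=tty1 root=/dev/mmcblk0p3 rootwait"
    print(missing_cgroup_params(example))
    print(with_cgroup_params(example))
```

Running `with_cgroup_params` on its own output changes nothing, so the edit is safe to repeat. The node still needs a reboot before the new parameters take effect.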

What did you expect to happen?

I would expect Kubespray to check that the necessary cgroups are enabled before proceeding with the installation of containerd.

How can we reproduce it (as minimally and precisely as possible)?

Run Kubespray against an inventory that includes a Raspberry Pi running Rocky Linux 9 as a worker node.

OS

Linux 5.14.0-362.24.1.el9_3.0.1.aarch64 aarch64
NAME="Rocky Linux"
VERSION="9.3 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.3 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2032-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.3"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.3"

Version of Ansible

ansible [core 2.16.6]
  config file = None
  configured module search path = ['/home/franz.nemeth/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/franz.nemeth/.local/lib/python3.12/site-packages/ansible
  ansible collection location = /home/franz.nemeth/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/franz.nemeth/.local/bin/ansible
  python version = 3.12.2 (main, Feb 21 2024, 00:00:00) [GCC 13.2.1 20231205 (Red Hat 13.2.1-6)] (/usr/bin/python3)
  jinja version = 3.1.2
  libyaml = True

Version of Python

Python 3.12.2

Version of Kubespray (commit)

bc6bd21ab

Network plugin used

custom_cni

Full inventory with variables


container_manager: containerd
etcd_deployment_type: kubeadm
# Cilium helm installation
kube_network_plugin: custom_cni
# This must be set to root otherwise cilium agent init fails with permission denied while copying to /opt/cni/bin
cni_bin_owner: root
custom_cni_chart_namespace: kube-system
custom_cni_chart_release_name: cilium
custom_cni_chart_repository_name: cilium
custom_cni_chart_repository_url: https://helm.cilium.io
custom_cni_chart_ref: cilium/cilium
custom_cni_chart_version: 1.15.2
custom_cni_chart_values:
  annotateK8sNode: true
  dnsPolicy: "ClusterFirstWithHostNet"
  kubeProxyReplacement: "true"
  k8sServiceHost: "10.11.12.13"
  k8sServicePort: "6443"
  nodeinit:
    enabled: true
  ipam:
    mode: "kubernetes"
    operator:
      clusterPoolIPv4PodCIDRList: ["{{ kube_pods_subnet }}"]

kube_version: v1.29.2

# Cluster Loglevel configuration
kube_log_level: 2

# Kubernetes internal network for services, unused block of space.
kube_service_addresses: 10.43.0.0/16
kube_pods_subnet: 10.42.0.0/16
---
# inventory.yaml
all:
  # Ansible connection settings.
  vars:
    # Explicitly set the connection type to the node servers.
    ansible_connection: ssh
  hosts:
    master:
      ansible_host: 10.11.12.13
    rpi:
      ansible_host: 10.11.12.14
  children:
    kube_control_plane:
      hosts:
        master:
    kube_node:
      hosts:
        master:
        rpi:
    etcd:
      hosts:
        master:
    k8s_cluster:
      children:
        kube_control_plane:
        kube_node:

Command used to invoke ansible

cd kubespray && ansible-playbook -i ../inventory ./cluster.yml -b

Output of ansible run

Sadly, I don't have the output right now. If you decide that this functionality should be added, I'm happy to reproduce the failure and provide it.

Anything else we need to know

If you decide this is out of scope for Kubespray, that's fine. Otherwise, I'm happy to submit a PR to fix this issue.

VannTen commented 2 months ago

Checking if cgroups are enabled seems reasonable to me. That could go in verify-settings (roles/kubernetes/preinstall)
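The logic of such a check could look roughly like the Python sketch below. It is not Kubespray code: an actual implementation would presumably be an Ansible task in the preinstall role, and the function name and sample table here are hypothetical. It assumes the kernel exposes the controller table in /proc/cgroups with the columns subsys_name, hierarchy, num_cgroups and enabled, and it treats a controller as usable only when the enabled column is 1.

```python
def disabled_controllers(proc_cgroups: str, required=("cpuset", "memory")) -> list[str]:
    """Return the required controllers that are missing or not enabled.

    `proc_cgroups` is the text content of /proc/cgroups.
    """
    enabled = set()
    for line in proc_cgroups.splitlines():
        # Skip blank lines and the "#subsys_name ..." header.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # Columns: subsys_name, hierarchy, num_cgroups, enabled.
        if len(fields) >= 4 and fields[3] == "1":
            enabled.add(fields[0])
    return [name for name in required if name not in enabled]


# Illustrative table only, not captured from the node in this report.
SAMPLE = """#subsys_name\thierarchy\tnum_cgroups\tenabled
cpuset\t0\t1\t1
cpu\t0\t1\t1
memory\t0\t1\t0
"""

if __name__ == "__main__":
    print(disabled_controllers(SAMPLE))
```

On a node, the same function could be fed `open("/proc/cgroups").read()`, and a non-empty result would be the condition to fail on before containerd is installed.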