k0sproject / k0s

k0s - The Zero Friction Kubernetes
https://docs.k0sproject.io

kube-proxy is missing modprobe which prevents enabling IPVS (since 1.29.0) #4626

Closed svanharmelen closed 1 week ago

svanharmelen commented 2 weeks ago

Platform

Linux 5.15.0-92-generic #102-Ubuntu SMP Wed Jan 10 09:37:39 UTC 2024 aarch64 GNU/Linux
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Version

1.30.1+k0s

Sysinfo

`k0s sysinfo`
Total memory: 7.7 GiB (pass)
Disk space available for /var/lib/k0s: 51.6 GiB (pass)
Name resolution: localhost: [127.0.0.1] (pass)
Operating system: Linux (pass)
  Linux kernel release: 5.15.0-92-generic (pass)
  Max. file descriptors per process: current: 1048576 / max: 1048576 (pass)
  AppArmor: active (pass)
  Executable in PATH: modprobe: /usr/sbin/modprobe (pass)
  Executable in PATH: mount: /usr/bin/mount (pass)
  Executable in PATH: umount: /usr/bin/umount (pass)
  /proc file system: mounted (0x9fa0) (pass)
  Control Groups: version 2 (pass)
    cgroup controller "cpu": available (is a listed root controller) (pass)
    cgroup controller "cpuacct": available (via cpu in version 2) (pass)
    cgroup controller "cpuset": available (is a listed root controller) (pass)
    cgroup controller "memory": available (is a listed root controller) (pass)
    cgroup controller "devices": available (device filters attachable) (pass)
    cgroup controller "freezer": available (cgroup.freeze exists) (pass)
    cgroup controller "pids": available (is a listed root controller) (pass)
    cgroup controller "hugetlb": available (is a listed root controller) (pass)
    cgroup controller "blkio": available (via io in version 2) (pass)
  CONFIG_CGROUPS: Control Group support: built-in (pass)
    CONFIG_CGROUP_FREEZER: Freezer cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_PIDS: PIDs cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_DEVICE: Device controller for cgroups: built-in (pass)
    CONFIG_CPUSETS: Cpuset support: built-in (pass)
    CONFIG_CGROUP_CPUACCT: Simple CPU accounting cgroup subsystem: built-in (pass)
    CONFIG_MEMCG: Memory Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_HUGETLB: HugeTLB Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_SCHED: Group CPU scheduler: built-in (pass)
      CONFIG_FAIR_GROUP_SCHED: Group scheduling for SCHED_OTHER: built-in (pass)
        CONFIG_CFS_BANDWIDTH: CPU bandwidth provisioning for FAIR_GROUP_SCHED: built-in (pass)
    CONFIG_BLK_CGROUP: Block IO controller: built-in (pass)
  CONFIG_NAMESPACES: Namespaces support: built-in (pass)
    CONFIG_UTS_NS: UTS namespace: built-in (pass)
    CONFIG_IPC_NS: IPC namespace: built-in (pass)
    CONFIG_PID_NS: PID namespace: built-in (pass)
    CONFIG_NET_NS: Network namespace: built-in (pass)
  CONFIG_NET: Networking support: built-in (pass)
    CONFIG_INET: TCP/IP networking: built-in (pass)
      CONFIG_IPV6: The IPv6 protocol: built-in (pass)
    CONFIG_NETFILTER: Network packet filtering framework (Netfilter): built-in (pass)
      CONFIG_NETFILTER_ADVANCED: Advanced netfilter configuration: built-in (pass)
      CONFIG_NF_CONNTRACK: Netfilter connection tracking support: module (pass)
      CONFIG_NETFILTER_XTABLES: Netfilter Xtables support: module (pass)
        CONFIG_NETFILTER_XT_TARGET_REDIRECT: REDIRECT target support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_COMMENT: "comment" match support: module (pass)
        CONFIG_NETFILTER_XT_MARK: nfmark target and match support: module (pass)
        CONFIG_NETFILTER_XT_SET: set target and match support: module (pass)
        CONFIG_NETFILTER_XT_TARGET_MASQUERADE: MASQUERADE target support: module (pass)
        CONFIG_NETFILTER_XT_NAT: "SNAT and DNAT" targets support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: "addrtype" address type match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_CONNTRACK: "conntrack" connection tracking match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_MULTIPORT: "multiport" Multiple port match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_RECENT: "recent" match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_STATISTIC: "statistic" match support: module (pass)
      CONFIG_NETFILTER_NETLINK: module (pass)
      CONFIG_NF_NAT: module (pass)
      CONFIG_IP_SET: IP set support: module (pass)
        CONFIG_IP_SET_HASH_IP: hash:ip set support: module (pass)
        CONFIG_IP_SET_HASH_NET: hash:net set support: module (pass)
      CONFIG_IP_VS: IP virtual server support: module (pass)
        CONFIG_IP_VS_NFCT: Netfilter connection tracking: built-in (pass)
        CONFIG_IP_VS_SH: Source hashing scheduling: module (pass)
        CONFIG_IP_VS_RR: Round-robin scheduling: module (pass)
        CONFIG_IP_VS_WRR: Weighted round-robin scheduling: module (pass)
      CONFIG_NF_CONNTRACK_IPV4: IPv4 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_REJECT_IPV4: IPv4 packet rejection: module (pass)
      CONFIG_NF_NAT_IPV4: IPv4 NAT: unknown (warning)
      CONFIG_IP_NF_IPTABLES: IP tables support: module (pass)
        CONFIG_IP_NF_FILTER: Packet filtering: module (pass)
          CONFIG_IP_NF_TARGET_REJECT: REJECT target support: module (pass)
        CONFIG_IP_NF_NAT: iptables NAT support: module (pass)
        CONFIG_IP_NF_MANGLE: Packet mangling: module (pass)
      CONFIG_NF_DEFRAG_IPV4: module (pass)
      CONFIG_NF_CONNTRACK_IPV6: IPv6 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_NAT_IPV6: IPv6 NAT: unknown (warning)
      CONFIG_IP6_NF_IPTABLES: IP6 tables support: module (pass)
        CONFIG_IP6_NF_FILTER: Packet filtering: module (pass)
        CONFIG_IP6_NF_MANGLE: Packet mangling: module (pass)
        CONFIG_IP6_NF_NAT: ip6tables NAT support: module (pass)
      CONFIG_NF_DEFRAG_IPV6: module (pass)
    CONFIG_BRIDGE: 802.1d Ethernet Bridging: module (pass)
      CONFIG_LLC: module (pass)
      CONFIG_STP: module (pass)
  CONFIG_EXT4_FS: The Extended 4 (ext4) filesystem: built-in (pass)
  CONFIG_PROC_FS: /proc file system support: built-in (pass)

What happened?

We tried to update from 1.28.x to the latest 1.30.x but the cluster failed to start after using the new binary (together with the k0s-airgap-bundle-v1.30.1+k0s.0-arm64 bundle). When looking closer at the logs we noticed this in the kube-proxy logs:

I0613 17:00:50.551997       1 server.go:511] "Using lenient decoding as strict decoding failed" err="strict decoding error: unknown field \"udpIdleTimeout\""
I0613 17:00:50.556647       1 server.go:1062] "Successfully retrieved node IP(s)" IPs=["10.211.55.76"]
I0613 17:00:50.561351       1 server.go:659] "kube-proxy running in dual-stack mode" primary ipFamily="IPv4"
time="2024-06-13T17:00:50Z" level=warning msg="Running modprobe ip_vs failed with message: ``, error: exec: \"modprobe\": executable file not found in $PATH"
time="2024-06-13T17:00:50Z" level=error msg="Could not get ipvs family information from the kernel. It is possible that ipvs is not enabled in your kernel. Native loadbalancing will not work until this is fixed."
I0613 17:00:50.563267       1 proxier.go:646] "Dummy VS not created" scheduler="rr"
E0613 17:00:50.563279       1 server.go:558] "Error running ProxyServer" err="can't use the IPVS proxier: Ipvs not supported"
E0613 17:00:50.563299       1 run.go:74] "command failed" err="can't use the IPVS proxier: Ipvs not supported"
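
The log above shows two separate failures: `modprobe` cannot be resolved on `$PATH`, and IPVS is then not visible from the kernel. The following is a minimal sketch of how both conditions could be checked by hand. The helper names `has_tool` and `ipvs_loaded` are hypothetical, and the second check assumes that `/proc/net/ip_vs` is present only while the `ip_vs` module is loaded. It is not how kube-proxy itself performs the check.

```shell
# has_tool NAME SEARCH_PATH (hypothetical helper)
# Succeeds if an executable called NAME can be resolved using SEARCH_PATH
# in place of the caller's PATH. Runs in a subshell, so PATH is left untouched.
has_tool() {
    ( PATH=$2; command -v "$1" >/dev/null 2>&1 )
}

# ipvs_loaded [PROC_FILE] (hypothetical helper)
# Succeeds if PROC_FILE exists. Defaults to /proc/net/ip_vs, which is assumed
# to exist only while the ip_vs kernel module is loaded.
ipvs_loaded() {
    [ -e "${1:-/proc/net/ip_vs}" ]
}
```

On the host, `has_tool modprobe "$PATH" && echo found` is expected to succeed here, because `k0s sysinfo` already reports `/usr/sbin/modprobe`. The failure in this issue happens inside the kube-proxy container, where the same lookup finds nothing.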

Steps to reproduce

  1. Copy/paste the config below
  2. Start the k0s cluster using: k0s controller --enable-worker --no-taints --config k0s.yaml
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: test-server
spec:
  api:
    port: 6443
  images:
    default_pull_policy: Never
  network:
    kubeProxy:
      mode: ipvs
    podCIDR: 10.244.0.0/16
    serviceCIDR: 10.96.0.0/12
  storage:
    type: kine
    kine:
      dataSource: sqlite:////var/test-server/state/kine.db?mode=rwc&_journal=WAL&cache=shared

Expected behavior

A working cluster

Actual behavior

The cluster fails to start (see the error output above), which also prevents networking from starting.

twz123 commented 2 weeks ago

Hi @svanharmelen! Thanks for pointing that out. Indeed, k0s's kube-proxy image is missing all the symlinks to kmod :-( That will be fixed ASAP.
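
For context: `kmod` is a multi-call binary, and tools such as `modprobe` are usually just symlinks to it, so an image that ships `kmod` without the links has no `modprobe` command. A rough sketch for inspecting a directory for those links follows. The function name `kmod_links` is hypothetical, and the list of tool names is the conventional kmod set rather than something taken from the k0s image.

```shell
# kmod_links DIR (hypothetical helper)
# For each tool name conventionally provided by kmod, report whether DIR/NAME
# is a symlink (and where it points), some other kind of file, or missing.
kmod_links() {
    dir=$1
    for tool in modprobe insmod rmmod lsmod modinfo depmod; do
        if [ -L "$dir/$tool" ]; then
            printf '%s -> %s\n' "$tool" "$(readlink "$dir/$tool")"
        elif [ -e "$dir/$tool" ]; then
            printf '%s (not a symlink)\n' "$tool"
        else
            printf '%s missing\n' "$tool"
        fi
    done
}
```

Running `kmod_links /usr/sbin` on a typical Ubuntu host should list each tool as a link to the `kmod` binary. In an image affected by this issue, every entry would presumably be reported as missing.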

svanharmelen commented 2 weeks ago

Cool, then we will wait for the fixed version to be released!

We did search for a repo that holds the build files and/or the Dockerfile for the kube-proxy container, to see if we could create a PR or make a custom build ourselves, but we failed to find such a repo.

Did we miss it or is that all stored in a private repo?

Thanks!

twz123 commented 1 week ago

Can you check if the latest point releases for 1.29/1.30 fixed the issue for you?

> Did we miss it or is that all stored in a private repo?

Indeed, they currently are. It's planned to open them up soonish.

svanharmelen commented 1 week ago

Yup, 1.30.2 seems to work as expected again. Thanks for the help @twz123 👍🏻
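
To check whether a node already runs a release at or beyond the one confirmed working above, a small version comparison can help. This is a sketch only: `version_ge` is a hypothetical helper, it assumes plain `MAJOR.MINOR.PATCH` versions with an optional leading `v` and `+...` suffix (such as `v1.30.2+k0s.0`), and the thread only confirms 1.30.2 for the 1.30 line. The fixed 1.29 point release is not named here.

```shell
# version_ge A B (hypothetical helper)
# Succeeds if version A is greater than or equal to version B.
# A leading "v" and any "+..." build suffix are ignored; both versions
# must consist of exactly three numeric components.
version_ge() {
    a=${1#v}; a=${a%%+*}
    b=${2#v}; b=${b%%+*}
    old_ifs=$IFS
    IFS=.
    # Split both versions on "." into $1..$3 (A) and $4..$6 (B).
    set -- $a $b
    IFS=$old_ifs
    if [ "$1" -ne "$4" ]; then [ "$1" -gt "$4" ]; return; fi
    if [ "$2" -ne "$5" ]; then [ "$2" -gt "$5" ]; return; fi
    [ "$3" -ge "$6" ]
}
```

For example, `version_ge "$(k0s version)" 1.30.2 && echo "fixed release"` could be run on a node, assuming `k0s version` prints a single version string in the format described above.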