kubernetes-sigs / kubespray

Deploy a Production Ready Kubernetes Cluster
Apache License 2.0

Ubuntu 24.04 - kubelet exited #11664

Closed hufhend closed 4 weeks ago

hufhend commented 4 weeks ago

What happened?

● kubelet.service - Kubernetes Kubelet Server
     Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; preset: enabled)
    Drop-In: /usr/lib/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: activating (auto-restart) (Result: exit-code) since Thu 2024-10-24 11:38:15 CEST; 3s ago
       Docs: https://github.com/GoogleCloudPlatform/kubernetes
    Process: 2173 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, statu>
   Main PID: 2173 (code=exited, status=203/EXEC)
        CPU: 1ms

What did you expect to happen?

I would expect it to run and have the right variables, like here on Ubuntu 22.04:

● kubelet.service - Kubernetes Kubelet Server
     Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; preset: enabled)
    Drop-In: /usr/lib/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Fri 2024-10-04 00:49:24 CEST; 2 weeks 6 days ago
       Docs: https://github.com/GoogleCloudPlatform/kubernetes
   Main PID: 60898 (kubelet)
      Tasks: 16 (limit: 14113)
     Memory: 79.4M (peak: 762.5M)
        CPU: 1d 7h 11min 52.026s
     CGroup: /system.slice/kubelet.service
             └─60898 /usr/local/bin/kubelet --v=2 --node-ip=192.168.3.105 --hostname-override=n-elb01-dc1 --bootstrap-kubeconfig=/etc/kubernetes/bootstr>

How can we reproduce it (as minimally and precisely as possible)?

For me it only shows up on Ubuntu 24.04 and on Armbian Linux v24.11 rolling for the Orange Pi Zero3, so I assume the problem is related to these distributions.

OS

Linux 6.8.0-47-generic x86_64
PRETTY_NAME="Ubuntu 24.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.1 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

and

Linux 6.6.54-current-sunxi64 aarch64
PRETTY_NAME="Armbian 24.11.0-trunk.318 bookworm"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.armbian.com"
SUPPORT_URL="https://forum.armbian.com"
BUG_REPORT_URL="https://www.armbian.com/bugs"
ARMBIAN_PRETTY_NAME="Armbian 24.11.0-trunk.318 bookworm"

Version of Ansible

ansible [core 2.16.10]
  config file = /home/hufhendr/git/kubespray/ansible.cfg
  configured module search path = ['/home/hufhendr/git/kubespray/library']
  ansible python module location = /home/hufhendr/git/kubespray-venv/lib/python3.12/site-packages/ansible
  ansible collection location = /home/hufhendr/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/hufhendr/git/kubespray-venv/bin/ansible
  python version = 3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0] (/home/hufhendr/git/kubespray-venv/bin/python3)
  jinja version = 3.1.4
  libyaml = True

Version of Python

Python 3.12.3

Version of Kubespray (commit)

4577ee4a5

Network plugin used

calico

Full inventory with variables

n-pro04-dc3 | SUCCESS => { "hostvars[kube_node]": "VARIABLE IS NOT DEFINED!" }

Command used to invoke ansible

ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml -b -l n-pro04-dc3

Output of ansible run

Surprisingly fine and bug-free

Anything else we need to know

It used to load and run normally; now the kubelet crashes right after the kubespray run finishes. Something must have changed in either the distribution or kubespray. In any case, the kubelet doesn't get its variables, even though the file that defines them exists: /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf

VannTen commented 4 weeks ago

That drop-in looks out of place. Kubespray does not use distro systemd unit files AFAIK, so this one might be interfering with the kubespray settings in unexpected ways.

hufhend commented 4 weeks ago

Maybe kubespray isn't using the distro's unit file but creating its own. Either way, it does use a systemd unit; here is what I see in kubespray:

- name: Enable kubelet
  service:
    name: kubelet
    enabled: true
    state: started

VannTen commented 4 weeks ago

That's not the point. The systemd service is probably from kubespray, since it's in /etc as your logs show. But the override is in /usr/lib, which makes me think it's part of a distro package. The combination of those two might not work.

Anyway, systemctl cat kubelet and journalctl -u kubelet might help you to diagnose the exact problem.
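One way to read the `systemctl cat kubelet` output is to find which file supplies the last `ExecStart=` that is still in effect, since an empty `ExecStart=` in a drop-in resets whatever the main unit defined. The Python sketch below is illustrative only: the function name `effective_exec_start` and the shortened `SAMPLE` text are assumptions, and the parser handles just the `# /path` file headers, backslash continuations, and the empty-value reset.

```python
def effective_exec_start(cat_output):
    """Return (command, source_file) for the last ExecStart still in effect.

    Simplified reading of `systemctl cat` output: lines starting with
    "# /" are treated as file headers, a trailing backslash continues
    a line, and an empty "ExecStart=" clears earlier entries.
    """
    source = None      # file whose contents are currently being read
    in_effect = []     # (command, source) pairs not yet reset
    buffer = ""        # ExecStart line being assembled across continuations
    for raw in cat_output.splitlines():
        line = raw.strip()
        if not buffer and line.startswith("# /"):
            source = line[2:]
            continue
        if buffer:
            buffer += " " + line
        elif line.startswith("ExecStart="):
            buffer = line
        else:
            continue
        if buffer.endswith("\\"):
            # Drop the continuation backslash and keep reading.
            buffer = buffer[:-1].rstrip()
            continue
        value = buffer[len("ExecStart="):].strip()
        buffer = ""
        if value:
            in_effect.append((value, source))
        else:
            in_effect.clear()  # empty assignment resets the list
    return in_effect[-1] if in_effect else None


# Shortened, illustrative sample in the shape of `systemctl cat` output.
SAMPLE = r"""# /etc/systemd/system/kubelet.service
[Service]
ExecStart=/usr/local/bin/kubelet \
                $KUBE_LOGTOSTDERR \
                $KUBELET_ARGS
Restart=always

# /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
[Service]
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS
"""

command, origin = effective_exec_start(SAMPLE)
print(command)
print(origin)
```

For this sample the sketch reports the `/usr/bin/kubelet` command from the drop-in under /usr/lib rather than the `/usr/local/bin/kubelet` command from the unit in /etc.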

hufhend commented 4 weeks ago

I checked; it's exactly the same as on Ubuntu 22.04, where it works:

# /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=containerd.service
Wants=containerd.service

[Service]
EnvironmentFile=-/etc/kubernetes/kubelet.env
ExecStart=/usr/local/bin/kubelet \
                $KUBE_LOGTOSTDERR \
                $KUBE_LOG_LEVEL \
                $KUBELET_API_SERVER \
                $KUBELET_ADDRESS \
                $KUBELET_PORT \
                $KUBELET_HOSTNAME \
                $KUBELET_ARGS \
                $DOCKER_SOCKET \
                $KUBELET_NETWORK_PLUGIN \
                $KUBELET_VOLUME_PLUGIN \
                $KUBELET_CLOUDPROVIDER
Restart=always
RestartSec=10s

[Install]
WantedBy=multi-user.target

# /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

VannTen commented 4 weeks ago

> ExecStart=
> ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

As I was saying. This uses the kubelet from the distro package instead of the one provided by kubespray.

Mixing configuration from kubespray and other sources for the kubelet is not supported, and won't be. Try removing the packages that own that override and see if the issue persists.

If it does, then you should open a new bug report.
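To find which package, if any, owns the drop-in on a Debian or Ubuntu host, `dpkg -S <path>` searches the installed packages for a file. The Python sketch below wraps that call; the helper names `parse_dpkg_search` and `owning_packages` are hypothetical, and it assumes `dpkg` is on the PATH.

```python
import subprocess


def parse_dpkg_search(output):
    """Extract package names from `dpkg -S` lines shaped like 'pkg: /path'."""
    owners = set()
    for line in output.splitlines():
        packages, sep, _path = line.partition(": ")
        # Skip lines without a 'pkg: path' shape and diversion notices.
        if not sep or "diversion" in packages:
            continue
        for name in packages.split(","):
            owners.add(name.strip())
    return sorted(owners)


def owning_packages(path):
    """Run `dpkg -S path`; return [] when no installed package owns it."""
    result = subprocess.run(
        ["dpkg", "-S", path],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit just means "no owner found"
    )
    if result.returncode != 0:
        return []
    return parse_dpkg_search(result.stdout)


# Example call (run on the affected node):
# owning_packages("/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf")
```

An empty result would suggest the file was not installed by a package currently known to dpkg, in which case it has to be tracked down some other way.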

/kind support
/close

k8s-ci-robot commented 4 weeks ago

@VannTen: Closing this issue.

hufhend commented 4 weeks ago

No, there is no distribution version installed.

apt-cache policy kubelet
kubelet:
  Installed: (none)
  Candidate: 1.30.6-1.1
  Version table:
     1.30.6-1.1 500
        500 https://pkgs.k8s.io/core:/stable:/v1.30/deb  Packages
     1.30.5-1.1 500
        500 https://pkgs.k8s.io/core:/stable:/v1.30/deb  Packages
     1.30.4-1.1 500
        500 https://pkgs.k8s.io/core:/stable:/v1.30/deb  Packages
     1.30.3-1.1 500
        500 https://pkgs.k8s.io/core:/stable:/v1.30/deb  Packages
     1.30.2-1.1 500
        500 https://pkgs.k8s.io/core:/stable:/v1.30/deb  Packages
     1.30.1-1.1 500
        500 https://pkgs.k8s.io/core:/stable:/v1.30/deb  Packages
     1.30.0-1.1 500
        500 https://pkgs.k8s.io/core:/stable:/v1.30/deb  Packages
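systemd documents exit status 203/EXEC as a failure to execute the configured program, most often because the file is missing or not executable. With `Installed: (none)` for the kubelet package, it may be worth checking whether the `/usr/bin/kubelet` path named in the drop-in exists at all. A minimal Python sketch, with the hypothetical helper names `exec_start_binary` and `can_execute`, and with shell-style splitting standing in for systemd's own command-line parsing:

```python
import os
import shlex


def exec_start_binary(exec_start):
    """Return the program path from an ExecStart value.

    Leading systemd prefix characters such as '-' or '@' are dropped.
    """
    tokens = shlex.split(exec_start)
    if not tokens:
        raise ValueError("empty ExecStart value")
    return tokens[0].lstrip("-@:+!")


def can_execute(path):
    """True if path is a regular file that the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


# ExecStart value from the drop-in, plus the path used by the unit in /etc.
drop_in_exec = (
    "/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS "
    "$KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS"
)
for candidate in (exec_start_binary(drop_in_exec), "/usr/local/bin/kubelet"):
    state = "executable" if can_execute(candidate) else "missing or not executable"
    print(candidate, state)
```

If the first path is reported as missing while the second is executable, that would be consistent with the drop-in pointing the service at a binary that is not installed.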

hufhend commented 3 weeks ago

This helped me: 1ca063b0ae755f159588d31006f4e9b5a9fc1196