vmware / photon

Minimal Linux container host
https://vmware.github.io/photon

k8s apiserver not running on RPi 4 model B #1558

Open Ignat99 opened 3 months ago

Ignat99 commented 3 months ago

Describe the bug

I have VMware ESXi 7.0.0 running on an RPi 4 Model B with the Photon OS 5.0 GA Binaries OVA (virtual hardware v14, arm64): Linux photon-ova 6.1.94-1.ph5-esx #1-photon SMP Sat Jun 29 03:04:00 UTC 2024 aarch64 GNU/Linux.

But systemctl restart kube-apiserver does not start the apiserver.

I also tried the Photon OS 4.0 Rev2 OVA (virtual hardware v13, arm64) and got the same errors.

Reproduction steps

  1. Use docs https://vmware.github.io/photon/docs-v5/user-guide/kubernetes-on-photon-os/running-kubernetes-on-photon-os/prepare-the-hosts/
  2. tdnf install kubernetes
  3. systemctl restart kube-apiserver
  4. The following message appears:

'Job for kube-apiserver.service failed because the control process exited with error code. See "systemctl status kube-apiserver.service" and "journalctl -xeu kube-apiserver.service" for details.'

systemctl status kube-apiserver.service

× kube-apiserver.service - Kubernetes API Server
     Loaded: loaded (/usr/lib/systemd/system/kube-apiserver.service; disabled; preset: enabled)
     Active: failed (Result: exit-code) since Mon 2024-07-01 08:41:10 UTC; 1min 13s ago
       Docs: https://github.com/GoogleCloudPlatform/kubernetes
    Process: 801 ExecStart=/usr/bin/kube-apiserver $KUBE_LOGTOSTDERR $KUBE_LOG_LEVEL $KUBE_ETCD_SERVERS $KUBE_API_ADDRESS $KUBE_API>
   Main PID: 801 (code=exited, status=1/FAILURE)
        CPU: 182ms

Jul 01 08:41:10 photon-ova systemd[1]: kube-apiserver.service: Main process exited, code=exited, status=1/FAILURE
Jul 01 08:41:10 photon-ova systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Jul 01 08:41:10 photon-ova systemd[1]: Failed to start Kubernetes API Server.
Jul 01 08:41:10 photon-ova systemd[1]: kube-apiserver.service: Scheduled restart job, restart counter is at 5.
Jul 01 08:41:10 photon-ova systemd[1]: Stopped Kubernetes API Server.
Jul 01 08:41:10 photon-ova systemd[1]: kube-apiserver.service: Start request repeated too quickly.
Jul 01 08:41:10 photon-ova systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Jul 01 08:41:10 photon-ova systemd[1]: Failed to start Kubernetes API Server.

...

Expected behavior

k8s does not start. Also, the documentation is incorrect with respect to the following file:

nano /etc/kubernetes/apiserver

# The address on the local server to listen to.
KUBE_API_ADDRESS="--insecure-bind-address=0.0.0.0"

# Comma separated list of nodes in the etcd cluster
# KUBE_ETCD_SERVERS="--etcd-servers=http://127.0.0.1:2379"
KUBE_ETCD_SERVERS="--etcd-servers=http://127.0.0.1:4001"

As you can see, the API --address flag has been renamed to --insecure-bind-address, and the etcd port number is not correct: it needs to be 2379.
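For reference, the corrected etcd entry in /etc/kubernetes/apiserver would look like this (only the port changes; the commented-out line shipped in the file already shows the right value):

# Comma separated list of nodes in the etcd cluster
KUBE_ETCD_SERVERS="--etcd-servers=http://127.0.0.1:2379"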

Additional context

I found numerous complaints about similar problems with keys on the internet.

kube-apiserver fails init, receiving "--service-account-signing-key-file and --service-account-issuer are required flag"
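For context, recent kube-apiserver versions require both flags to be supplied explicitly. Illustratively (the issuer URL is the usual kubeadm default and the key paths are placeholders, not Photon defaults; all other apiserver flags are omitted):

# Illustrative only; paths and the issuer URL are assumptions, not Photon defaults.
kube-apiserver \
  --service-account-issuer=https://kubernetes.default.svc.cluster.local \
  --service-account-signing-key-file=/etc/kubernetes/pki/sa.key \
  --service-account-key-file=/etc/kubernetes/pki/sa.pub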

I rebuilt apiserver directly on my RPi device with the same result. So the problem is definitely in the configuration, not the software.

dcasota commented 3 months ago

@Ignat99 Photon OS repository contains the packages kubernetes kubernetes-kubeadm kubernetes-dns kubernetes-metrics-server kubernetes-pause, and yes, unfortunately there are issues with the utility options.

Mixing the options is actually a challenge. The user guide chapters Running Kubernetes on Photon OS and Kubeadm Cluster on Photon OS (configuring a Master Node) do not help in any constellation.

Port 8080 is used if no configuration is found/specified. Does kubectl get node show connection refused messages?
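For reference, that failure mode usually shows up like this (illustrative output):

kubectl get node
The connection to the server localhost:8080 was refused - did you specify the right host or port?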

Does minikube work?

cd $HOME
curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-aarch64
chmod +x minikube
./minikube start -p cnb --force
kubectl get node

Ignat99 commented 3 months ago

Yes, minikube works.

 ignat99@photon-ova [ ~ ]$ ./minikube start -p cnb --force

* [cnb] minikube v1.17.1 on Photon 5.0 (arm64)
! minikube skips various validations when --force is supplied; this may lead to unexpected behavior
* Automatically selected the docker driver
* Starting control plane node cnb in cluster cnb
* Pulling base image ...
* Creating docker container (CPUs=2, Memory=2200MB) ...
* Stopping node "cnb"  ...
* Powering off "cnb" via SSH ...
* Deleting "cnb" in docker ...
! StartHost failed, but will try again: creating host: create host timed out in 360.000000 seconds
* Creating docker container (CPUs=2, Memory=2200MB) ...
* Preparing Kubernetes v1.20.2 on Docker 20.10.2 ...
  - Generating certificates and keys ...
  - Booting up control plane ...
  - Configuring RBAC rules ...
* Verifying Kubernetes components...
* Enabled addons: default-storageclass, storage-provisioner

! /bin/kubectl is version 1.27.13, which may have incompatibilites with Kubernetes 1.20.2.
  - Want kubectl v1.20.2? Try 'minikube kubectl -- get pods -A'
* Done! kubectl is now configured to use "cnb" cluster and "default" namespace by default

ignat99@photon-ova [ ~ ]$  kubectl get node
NAME   STATUS   ROLES                  AGE   VERSION
cnb    Ready    control-plane,master   45s   v1.20.2

I added RAM to the virtual machine and changed the number of CPUs from 1 to 4.

ignat99@photon-ova [ ~ ]$  kubectl get pods -A

NAMESPACE     NAME                          READY   STATUS             RESTARTS   AGE
kube-system   coredns-74ff55c5b-qw72d       0/1     Running            0          7m24s
kube-system   etcd-cnb                      1/1     Running            0          7m29s
kube-system   kube-apiserver-cnb            1/1     Running            0          7m29s
kube-system   kube-controller-manager-cnb   1/1     Running            0          7m29s
kube-system   kube-proxy-4lv2v              0/1     CrashLoopBackOff   6          7m24s
kube-system   kube-scheduler-cnb            1/1     Running            0          7m29s
kube-system   storage-provisioner           0/1     CrashLoopBackOff   5          7m30s

But 3 pods are not running...
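The usual next step for the CrashLoopBackOff pods would be to inspect their logs and events (standard kubectl commands; pod names taken from the output above):

kubectl -n kube-system logs kube-proxy-4lv2v --previous
kubectl -n kube-system describe pod storage-provisioner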

dcasota commented 3 months ago

Yes, the issue is reproducible on x86_64 as well.

One issue is the flag logtostderr, see sudo journalctl -xeu kube-apiserver.service | grep logtostderr. The flag was deprecated and has been removed in v1.26, but the Kubernetes v1.27 package still uses it. One possible workaround until fixed packages are published might be to use an override.conf per service.
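As a sketch of that workaround (assumptions: the drop-in path below is the standard systemd location, and the full list of $KUBE_* variables has to be copied from the packaged unit file, since the ExecStart line is truncated in the status output above):

# created e.g. with: systemctl edit kube-apiserver
# /etc/systemd/system/kube-apiserver.service.d/override.conf  (sketch)
[Service]
# An empty ExecStart= clears the packaged command line; the second line
# restates it without the removed $KUBE_LOGTOSTDERR flag. Only the variables
# visible in the truncated status output are repeated here.
ExecStart=
ExecStart=/usr/bin/kube-apiserver $KUBE_LOG_LEVEL $KUBE_ETCD_SERVERS $KUBE_API_ADDRESS

Afterwards, systemctl daemon-reload and systemctl restart kube-apiserver would pick up the drop-in.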

prashant1221 commented 3 months ago

Thanks for pointing this out. Will work on fixes starting next week.

dcasota commented 3 months ago

edited July 3rd: @prashant1221 Kubernetes does not start with the default configuration. In a typical constellation, the system receives an IPv4 address, e.g. on network adapter eth0, but that IPv4 address isn't inserted during installation, e.g. into /etc/kubernetes/config. The installation default is KUBE_MASTER="--master=http://127.0.0.1:8080". There is no sort of network event brokering for Kubernetes.
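To illustrate the kind of adjustment meant here, the loopback default in /etc/kubernetes/config would have to be replaced with the node's actual address (a sketch; 192.168.1.10 is a placeholder for the address on eth0):

# /etc/kubernetes/config (sketch; 192.168.1.10 is a placeholder)
KUBE_MASTER="--master=http://192.168.1.10:8080"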

@prashant1221 btw. from the options above, another issue is the combination of options 4+6 (docker-rootless without privileged security context, plus docker configured to start on boot with systemd), which does not work properly. Compare the Photon OS docs docker-rootless-support, docker start on boot with systemd, specific-subuid-range-for-systemd-homed-users and the correlation to the kernel parameter CONFIG_EXT4_FS_SECURITY. There are a few discussions, see here, here and here, which describe the issue [rootlesskit:parent] error: failed to setup UID/GID map. The issue is reproducible, see the ssh rootless-docker output.

I'll try to clarify this. This is the code snippet I use for docker-rootless.

  1. Login as root.
    
    tdnf install -y shadow fuse slirp4netns libslirp
    tdnf install -y docker-rootless

https://docs.docker.com/engine/install/linux-postinstall/

Manage docker as a non-root user

ROOTLESS_USER="rootless" # change here

subuid start range is prepared for systemd-homed, see https://rootlesscontaine.rs/getting-started/common/subuid/#specific-subuid-range-for-systemd-homed-users

uid="524288"

# useradd flags: -f -1 disables password expiration, -m creates the home
# folder, -g docker sets the primary group.
useradd $ROOTLESS_USER -f -1 -m -g docker
echo "$ROOTLESS_USER:$uid:65536" >> /etc/subuid
echo "$ROOTLESS_USER:$uid:65536" >> /etc/subgid
echo "kernel.unprivileged_userns_clone = 1" >> /etc/sysctl.d/50-rootless.conf
chmod 644 /etc/subuid /etc/subgid /etc/sysctl.d/50-rootless.conf

Reload the settings

sysctl --system
modprobe ip_tables

change password of rootless-user

passwd $ROOTLESS_USER

Configure login initialization for rootless-user

cat <<EOFROOTLESS | tee /tmp/rootless.sh
echo 'export PATH=/usr/local/bin:$PATH' >> ~/.bash_profile
echo 'export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock' >> ~/.bash_profile
EOFROOTLESS
chmod 644 /tmp/rootless.sh
chmod a+x /tmp/rootless.sh
ssh ${ROOTLESS_USER:+${ROOTLESS_USER}@}localhost '/tmp/rootless.sh && exit'
rm -f /tmp/rootless.sh

Run precheck as rootless-user

ssh ${ROOTLESS_USER:+${ROOTLESS_USER}@}localhost 'dockerd-rootless-setuptool.sh check --force && exit'

Run installation as rootless-user

ssh ${ROOTLESS_USER:+${ROOTLESS_USER}@}localhost 'dockerd-rootless-setuptool.sh install --force && exit'


2. Add a demo user (different from root and from the rootless user). Exit root.

DEMO_USER="dcasota"
useradd $DEMO_USER -f -1 -m -g users -G sudo,wheel
passwd $DEMO_USER

exit


3. Login as demo user.

Process test 1.

docker run -it photon date 1>NUL 2>&1 || echo $?
126

Result: It doesn't work without sudo.

Process test 2.

sudo docker run -it photon date 1>NUL 2>&1 || echo $?


Result: In privileged security context, it works flawlessly.

prashant1221 commented 4 weeks ago

The Kubeadm documentation should work: https://vmware.github.io/photon/docs-v5/user-guide/kubernetes-on-photon-os/kubernetes-kubeadm-cluster-on-photon/
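For reference, that route boils down to something along these lines (a rough sketch, not the verbatim steps from the guide; it assumes the kubelet unit is installed with the package, and the pod network CIDR is a placeholder for whichever CNI add-on is chosen):

# sketch of the kubeadm path on Photon OS
tdnf install -y kubernetes-kubeadm
systemctl enable --now kubelet
kubeadm init --pod-network-cidr=10.244.0.0/16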

Also, by doing the service file configuration as defined here https://github.com/kelseyhightower/kubernetes-the-hard-way/tree/master, I was able to start the cluster and run an application successfully.

So proper configuration file changes are needed so that the services can start with the default configuration.