spurin / diveintokcna

Dive Into Containers, Kubernetes and the Kubernetes Cloud Native Associate Certification

[Lab Issue]: Failed to setup worker node #50

Open mchurichi opened 6 days ago

mchurichi commented 6 days ago

How is the lab being run?

Docker Compose

Operating System

Linux - Fedora

Browser

Firefox

Lab Details

Installing a k3s Kubernetes Cluster

Issue Description

Starting from a brand new lab deployment, I'm having trouble getting both worker nodes set up while following the steps.

I was able to configure the control plane node and the first worker node (kubectl get nodes shows both), but when I run the ssh command on the control plane to set up worker-2 remotely, it gets stuck on [INFO] systemd: Starting k3s-agent:

root@control-plane:~# kubectl get nodes
NAME            STATUS   ROLES                  AGE   VERSION
worker-1        Ready    <none>                 6s    v1.28.4+k3s1
control-plane   Ready    control-plane,master   19s   v1.28.4+k3s1
root@control-plane:~# ssh worker-2 'curl -sfLk https://get.k3s.io | INSTALL_K3S_VERSION=v1.28.4+k3s1 K3S_URL=https://control-plane:6443 K3S_TOKEN=KCNA INSTALL_K3S_EXEC="--kubelet-arg=eviction-hard=imagefs.available<1%,nodefs.available<1%" sh -'
[INFO]  Using v1.28.4+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.28.4+k3s1/sha256sum-amd64.txt
[INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.28.4+k3s1/k3s
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Skipping installation of SELinux RPM
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Creating /usr/local/bin/ctr symlink to k3s
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s-agent.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s-agent.service
[INFO]  systemd: Enabling k3s-agent unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.
[INFO]  Host iptables-save/iptables-restore tools not found
[INFO]  Host ip6tables-save/ip6tables-restore tools not found
[INFO]  systemd: Starting k3s-agent

After waiting for a while, I pressed CTRL+C, ran "Sync the lesson environment", and tried again from scratch, but the error remains.

I also tried syncing at the next lab step (Kubernetes Pods), with the same result:

...
PLAY [Setup K3S (workers)] **************************************************************************************************************************************************************************************************************************************************************

TASK [Setup K3S] ************************************************************************************************************************************************************************************************************************************************************************
fatal: [worker-2]: FAILED! => {"changed": true, "cmd": "curl -sfLk https://get.k3s.io | INSTALL_K3S_VERSION=v1.28.4+k3s1 K3S_URL=https://control-plane:6443 K3S_TOKEN=KCNA INSTALL_K3S_EXEC=\"--kubelet-arg=eviction-hard=imagefs.available<1%,nodefs.available<1%\" sh -", "delta": "0:00:08.392899", "end": "2024-09-11 09:50:46.683163", "msg": "non-zero return code", "rc": 1, "start": "2024-09-11 09:50:38.290264", "stderr": "Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.\nJob for k3s-agent.service failed because the control process exited with error code.\nSee \"systemctl status k3s-agent.service\" and \"journalctl -xeu k3s-agent.service\" for details.", "stderr_lines": ["Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.", "Job for k3s-agent.service failed because the control process exited with error code.", "See \"systemctl status k3s-agent.service\" and \"journalctl -xeu k3s-agent.service\" for details."], "stdout": "[INFO]  Using v1.28.4+k3s1 as release\n[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.28.4+k3s1/sha256sum-amd64.txt\n[INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.28.4+k3s1/k3s\n[INFO]  Verifying binary download\n[INFO]  Installing k3s to /usr/local/bin/k3s\n[INFO]  Skipping installation of SELinux RPM\n[INFO]  Creating /usr/local/bin/kubectl symlink to k3s\n[INFO]  Creating /usr/local/bin/crictl symlink to k3s\n[INFO]  Creating /usr/local/bin/ctr symlink to k3s\n[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh\n[INFO]  Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh\n[INFO]  env: Creating environment file /etc/systemd/system/k3s-agent.service.env\n[INFO]  systemd: Creating service file /etc/systemd/system/k3s-agent.service\n[INFO]  systemd: Enabling k3s-agent unit\n[INFO]  Host iptables-save/iptables-restore tools not 
found\n[INFO]  Host ip6tables-save/ip6tables-restore tools not found\n[INFO]  systemd: Starting k3s-agent", "stdout_lines": ["[INFO]  Using v1.28.4+k3s1 as release", "[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.28.4+k3s1/sha256sum-amd64.txt", "[INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.28.4+k3s1/k3s", "[INFO]  Verifying binary download", "[INFO]  Installing k3s to /usr/local/bin/k3s", "[INFO]  Skipping installation of SELinux RPM", "[INFO]  Creating /usr/local/bin/kubectl symlink to k3s", "[INFO]  Creating /usr/local/bin/crictl symlink to k3s", "[INFO]  Creating /usr/local/bin/ctr symlink to k3s", "[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh", "[INFO]  Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh", "[INFO]  env: Creating environment file /etc/systemd/system/k3s-agent.service.env", "[INFO]  systemd: Creating service file /etc/systemd/system/k3s-agent.service", "[INFO]  systemd: Enabling k3s-agent unit", "[INFO]  Host iptables-save/iptables-restore tools not found", "[INFO]  Host ip6tables-save/ip6tables-restore tools not found", "[INFO]  systemd: Starting k3s-agent"]}
changed: [worker-1]

PLAY [Check K3S] ************************************************************************************************************************************************************************************************************************************************************************

TASK [Check if the control-plane and worker nodes are ready] ****************************************************************************************************************************************************************************************************************************
changed: [control-plane] => (item=control-plane)
changed: [control-plane] => (item=worker-1)
changed: [control-plane] => (item=worker-2)

TASK [Check that expected Kubernetes resources are running] *****************************************************************************************************************************************************************************************************************************
changed: [control-plane] => (item=kubectl -n kube-system wait deployment.apps/local-path-provisioner --for condition=Available=True --timeout=5s)
changed: [control-plane] => (item=kubectl -n kube-system wait deployment.apps/coredns --for condition=Available=True --timeout=5s)
changed: [control-plane] => (item=kubectl -n kube-system wait deployment.apps/metrics-server --for condition=Available=True --timeout=5s)
changed: [control-plane] => (item=kubectl -n kube-system get service/kube-dns)
changed: [control-plane] => (item=kubectl -n kube-system get service/metrics-server)
changed: [control-plane] => (item=kubectl get service/kubernetes)

PLAY [Remove any artifacts] *************************************************************************************************************************************************************************************************************************************************************

TASK [Delete specified Kubernetes resources if they exist] ******************************************************************************************************************************************************************************************************************************
ok: [control-plane]

TASK [Run commands] *********************************************************************************************************************************************************************************************************************************************************************
changed: [control-plane] => (item=kubectl config set-context --current --namespace=default)
ok: [control-plane] => (item=kill $(cat /var/run/kubectl-proxy.pid))
ok: [control-plane] => (item=kubectl uncordon control-plane worker-1 worker-2)

TASK [Delete files and directories] *****************************************************************************************************************************************************************************************************************************************************
changed: [control-plane]

PLAY RECAP ******************************************************************************************************************************************************************************************************************************************************************************
control-plane              : ok=12   changed=7    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0   
worker-1                   : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
worker-2                   : ok=3    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0
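
The stderr in the failed task points at `systemctl status k3s-agent.service` and `journalctl -xeu k3s-agent.service` for details. A minimal sketch for collecting that output from the control plane, assuming ssh to worker-2 works as in the lab steps. The function name `collect_agent_logs` and the `SSH_CMD` override are hypothetical helpers for illustration, not part of the lab:

```shell
# Collect k3s-agent diagnostics from a worker node over ssh.
# SSH_CMD is a hypothetical override (defaults to ssh) that allows a dry run.
collect_agent_logs() {
  host="$1"
  # Both commands are the ones named in the installer's stderr.
  # --no-pager keeps the output non-interactive; tail limits it to recent lines.
  "${SSH_CMD:-ssh}" "$host" \
    'systemctl status k3s-agent.service --no-pager; journalctl -xeu k3s-agent.service --no-pager | tail -n 50'
}

# Usage from the control plane:
#   collect_agent_logs worker-2
```

Setting `SSH_CMD=echo` prints the remote command instead of running it, which is a quick way to check what would be sent to the node.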

Additional Feedback

No response

spurin commented 6 days ago

Thanks Maximiliano, how unusual. Are you able to ping/ssh worker-2 from the control-plane? Alternatively, are you able to open a tab to that instance using the UI?
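
The ping/ssh check asked about above could be run from the control-plane along these lines. This is only a sketch: `check_worker`, `PING_CMD` and `SSH_CMD` are hypothetical names introduced here, and the 5 second timeout is an arbitrary choice.

```shell
# check_worker HOST: report whether HOST answers ping and accepts ssh.
# PING_CMD and SSH_CMD are hypothetical overrides (default: ping, ssh) for dry runs.
check_worker() {
  host="$1"
  # Three echo requests are enough to tell whether the node is reachable.
  if "${PING_CMD:-ping}" -c 3 "$host" >/dev/null 2>&1; then
    echo "ping $host: ok"
  else
    echo "ping $host: failed"
  fi
  # BatchMode avoids hanging on a password prompt; ConnectTimeout bounds the wait.
  if "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" hostname >/dev/null 2>&1; then
    echo "ssh $host: ok"
  else
    echo "ssh $host: failed"
  fi
}

# Usage from the control plane:
#   check_worker worker-2
```

If ping succeeds but ssh fails, the node is up but its ssh service is not reachable. If both succeed, the problem is more likely in the k3s-agent service itself.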