Closed: Miguelerja closed this issue 3 years ago.
I would probably taint the server to ensure that it doesn't end up with any pods running on it. Either that, or make one of the 4Bs the master. The Pi 3s are capable of running k3s plus a small number of pods, but due to slow SD card IO they will frequently struggle to keep up with datastore operations while also pulling and running images at the same time. If you do want to run the server on the 3, besides tainting it you should also make sure you're using a high-speed SD card or external (USB) storage.
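For example, the taint could be applied to the existing node or passed at install time (a sketch; the taint key here is arbitrary, and pi3black is the Pi 3's hostname from the cluster description below):
# Keep regular workloads off the Pi 3 server node
kubectl taint nodes pi3black k3s-controlplane=true:NoSchedule
# Or set it when (re)installing the server
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --node-taint k3s-controlplane=true:NoSchedule" sh -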
I'm seeing very similar behaviour, but my problems start right at installation on the master node:
pi@rpi3:~ $ free -m
total used free shared buff/cache available
Mem: 962 100 484 12 377 793
Swap: 99 0 99
pi@rpi3:~ $ curl -sfL https://get.k3s.io | sh -
[INFO] Finding release for channel stable
[INFO] Using v1.19.4+k3s1 as release
[INFO] Downloading hash https://github.com/rancher/k3s/releases/download/v1.19.4+k3s1/sha256sum-arm64.txt
[INFO] Downloading binary https://github.com/rancher/k3s/releases/download/v1.19.4+k3s1/k3s-arm64
[INFO] Verifying binary download
[INFO] Installing k3s to /usr/local/bin/k3s
[INFO] Creating /usr/local/bin/kubectl symlink to k3s
[INFO] Creating /usr/local/bin/crictl symlink to k3s
[INFO] Skipping /usr/local/bin/ctr symlink to k3s, command exists in PATH at /usr/bin/ctr
[INFO] Creating killall script /usr/local/bin/k3s-killall.sh
[INFO] Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO] env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO] systemd: Creating service file /etc/systemd/system/k3s.service
[INFO] systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO] systemd: Starting k3s
Stuck here for more than 15 minutes now.
There's high disk usage and then it runs out of memory. If I disconnect it from power it will run OK for less than a minute, then stop responding again. It has to be a new regression; I was able to create a cluster 3-4 months ago on the same RPi and high-speed SD card.
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.17.14+k3s3 sh -
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.18.12+k3s2 sh -
Both start successfully, so it looks like a v1.19 regression.
@agilob can you provide logs? journalctl --no-pager -u k3s
Are you running arm64 on a Pi3? As far as I know they don't actually benefit much from arm64 because they have so little memory.
Also, it looks like you have swap enabled. Best practice for Kubernetes is to run without swap, especially on systems where swap is on slow storage like a SD card.
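On Raspbian/Raspberry Pi OS, disabling swap usually looks something like this (a sketch, assuming the stock dphys-swapfile setup):
# Turn off swap now and keep it off across reboots
sudo dphys-swapfile swapoff
sudo dphys-swapfile uninstall
sudo systemctl disable --now dphys-swapfile
# Confirm: the Swap line in free -m should read all zeros
free -m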
Here's that same release running on my Pi3b, freshly installed just now:
root@pi03:~# curl -sfL https://get.k3s.io | sh -
[INFO] Finding release for channel stable
[INFO] Using v1.19.4+k3s1 as release
[INFO] Downloading hash https://github.com/rancher/k3s/releases/download/v1.19.4+k3s1/sha256sum-arm.txt
[INFO] Downloading binary https://github.com/rancher/k3s/releases/download/v1.19.4+k3s1/k3s-armhf
[INFO] Verifying binary download
[INFO] Installing k3s to /usr/local/bin/k3s
[INFO] Creating /usr/local/bin/kubectl symlink to k3s
[INFO] Creating /usr/local/bin/crictl symlink to k3s
[INFO] Creating /usr/local/bin/ctr symlink to k3s
[INFO] Creating killall script /usr/local/bin/k3s-killall.sh
[INFO] Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO] env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO] systemd: Creating service file /etc/systemd/system/k3s.service
[INFO] systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO] systemd: Starting k3s
root@pi03:~# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
pi03.lan.khaus Ready master 17s v1.19.4+k3s1 10.0.1.25 <none> Ubuntu 20.10 5.8.0-1007-raspi containerd://1.4.1-k3s1
root@pi03:~# free
total used free shared buff/cache available
Mem: 873248 471092 10860 12304 391296 374748
Swap: 0 0 0
root@pi03:~# uname -a
Linux pi03.lan.khaus 5.8.0-1007-raspi #10-Ubuntu SMP PREEMPT Thu Nov 5 18:01:40 UTC 2020 armv7l armv7l armv7l GNU/Linux
@agilob can you provide logs? journalctl --no-pager -u k3s
Well, not really; it takes 3-5 seconds after k3s-server is started for the system to become unresponsive, so there isn't much time to execute the command.
Are you running arm64 on a Pi3?
Yes, Raspbian 64-bit.
As far as I know they don't actually benefit much from arm64 because they have so little memory.
I run it just because I can, not for performance or anything.
Also, it looks like you have swap enabled. Best practice for Kubernetes is to run without swap, especially on systems where swap is on slow storage like a SD card.
freshly installed just now:
I also tried on 64-bit Ubuntu with a much slower SD card, with the same effect.
Nevertheless, I installed k3s just for testing. All my attempts (over the last 2 years, 0 successes) to use k3s at home fail tragically with random TLS errors. k3s never starts correctly, crashes often, and even when it does start and run, agent nodes have problems (re)connecting due to TLS errors like #1884 and many others, so I've already uninstalled k3s from all my devices.
Dec 08 18:33:54 rpi3 k3s[3072]: E1208 18:33:54.750766 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:54 rpi3 k3s[3072]: E1208 18:33:54.851642 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:54 rpi3 k3s[3072]: E1208 18:33:54.952856 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: E1208 18:33:55.053569 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: E1208 18:33:55.154264 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: time="2020-12-08T18:33:55.192865300Z" level=warning msg="Unable to watch for tunnel endpoints: unknown (get endpoints)"
Dec 08 18:33:55 rpi3 k3s[3072]: E1208 18:33:55.255534 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: E1208 18:33:55.356598 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: E1208 18:33:55.456957 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: E1208 18:33:55.559702 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: E1208 18:33:55.662447 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: E1208 18:33:55.762740 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: E1208 18:33:55.863125 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: E1208 18:33:55.964129 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:55 rpi3 k3s[3072]: time="2020-12-08T18:33:55.969066142Z" level=info msg="Waiting for cloudcontroller rbac role to be created"
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.065040 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.165457 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.229118 3072 node.go:125] Failed to retrieve node info: nodes "rpi3" is forbidden: User "system:kube-proxy" cannot get resource "nodes" in API group "" at the cluster scope
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.266531 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.368506 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.469936 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:56 rpi3 k3s[3072]: time="2020-12-08T18:33:56.529980809Z" level=info msg="Waiting for node rpi3: nodes \"rpi3\" not found"
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.571054 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.673607 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.776471 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.878741 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:56 rpi3 k3s[3072]: E1208 18:33:56.980964 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:57 rpi3 k3s[3072]: E1208 18:33:57.086352 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:57 rpi3 k3s[3072]: E1208 18:33:57.188085 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:57 rpi3 k3s[3072]: E1208 18:33:57.290636 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:57 rpi3 k3s[3072]: E1208 18:33:57.391590 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:57 rpi3 k3s[3072]: E1208 18:33:57.492055 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:57 rpi3 k3s[3072]: E1208 18:33:57.592316 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:57 rpi3 k3s[3072]: E1208 18:33:57.695001 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:57 rpi3 k3s[3072]: E1208 18:33:57.796136 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:57 rpi3 k3s[3072]: E1208 18:33:57.898382 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:57 rpi3 k3s[3072]: E1208 18:33:57.999090 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:58 rpi3 k3s[3072]: E1208 18:33:58.100080 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:58 rpi3 k3s[3072]: E1208 18:33:58.200910 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:58 rpi3 k3s[3072]: E1208 18:33:58.302300 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:58 rpi3 k3s[3072]: E1208 18:33:58.402672 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:58 rpi3 k3s[3072]: E1208 18:33:58.503007 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:58 rpi3 k3s[3072]: time="2020-12-08T18:33:58.567570945Z" level=info msg="Waiting for node rpi3: nodes \"rpi3\" not found"
Dec 08 18:33:58 rpi3 k3s[3072]: E1208 18:33:58.604557 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:58 rpi3 k3s[3072]: E1208 18:33:58.705283 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:58 rpi3 k3s[3072]: E1208 18:33:58.806172 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:58 rpi3 k3s[3072]: E1208 18:33:58.907480 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:59 rpi3 k3s[3072]: E1208 18:33:59.008020 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:59 rpi3 k3s[3072]: E1208 18:33:59.113023 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:59 rpi3 k3s[3072]: E1208 18:33:59.228752 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:59 rpi3 k3s[3072]: E1208 18:33:59.330816 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:59 rpi3 k3s[3072]: E1208 18:33:59.434107 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:59 rpi3 k3s[3072]: E1208 18:33:59.537034 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:59 rpi3 k3s[3072]: E1208 18:33:59.637700 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:59 rpi3 k3s[3072]: E1208 18:33:59.738114 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:59 rpi3 k3s[3072]: E1208 18:33:59.838735 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:33:59 rpi3 k3s[3072]: E1208 18:33:59.939357 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:00 rpi3 k3s[3072]: E1208 18:34:00.041896 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:00 rpi3 k3s[3072]: E1208 18:34:00.145699 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:00 rpi3 k3s[3072]: time="2020-12-08T18:34:00.203731067Z" level=warning msg="Unable to watch for tunnel endpoints: unknown (get endpoints)"
Dec 08 18:34:00 rpi3 k3s[3072]: E1208 18:34:00.246757 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:00 rpi3 k3s[3072]: E1208 18:34:00.348715 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:00 rpi3 k3s[3072]: E1208 18:34:00.449649 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:00 rpi3 k3s[3072]: E1208 18:34:00.550054 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:00 rpi3 k3s[3072]: time="2020-12-08T18:34:00.604340165Z" level=info msg="Waiting for node rpi3: nodes \"rpi3\" not found"
Dec 08 18:34:00 rpi3 k3s[3072]: E1208 18:34:00.652709 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:00 rpi3 k3s[3072]: E1208 18:34:00.753144 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:00 rpi3 k3s[3072]: E1208 18:34:00.855938 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:00 rpi3 k3s[3072]: E1208 18:34:00.956796 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:01 rpi3 k3s[3072]: E1208 18:34:01.057000 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:01 rpi3 k3s[3072]: E1208 18:34:01.161374 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:01 rpi3 k3s[3072]: E1208 18:34:01.262092 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:01 rpi3 k3s[3072]: E1208 18:34:01.362918 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:01 rpi3 k3s[3072]: E1208 18:34:01.463555 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:01 rpi3 k3s[3072]: E1208 18:34:01.564748 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:01 rpi3 k3s[3072]: E1208 18:34:01.665035 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:01 rpi3 k3s[3072]: E1208 18:34:01.766035 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:01 rpi3 k3s[3072]: E1208 18:34:01.867003 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:01 rpi3 k3s[3072]: E1208 18:34:01.968229 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:02 rpi3 k3s[3072]: E1208 18:34:02.069559 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:02 rpi3 k3s[3072]: E1208 18:34:02.170040 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:02 rpi3 k3s[3072]: E1208 18:34:02.271173 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:02 rpi3 k3s[3072]: E1208 18:34:02.372425 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:02 rpi3 k3s[3072]: E1208 18:34:02.475208 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:02 rpi3 k3s[3072]: E1208 18:34:02.577051 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:02 rpi3 k3s[3072]: time="2020-12-08T18:34:02.641751483Z" level=info msg="Waiting for node rpi3: nodes \"rpi3\" not found"
Dec 08 18:34:02 rpi3 k3s[3072]: E1208 18:34:02.678934 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:02 rpi3 k3s[3072]: E1208 18:34:02.781235 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:02 rpi3 k3s[3072]: E1208 18:34:02.884516 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.917120 3072 dynamic_cafile_content.go:167] Starting client-ca-bundle::/var/lib/rancher/k3s/server/tls/client-ca.crt
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.917113 3072 dynamic_cafile_content.go:167] Starting request-header::/var/lib/rancher/k3s/server/tls/request-header-ca.crt
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.920205 3072 dynamic_serving_content.go:130] Starting serving-cert::/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt::/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.key
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.926280 3072 secure_serving.go:197] Serving securely on 127.0.0.1:6444
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.926439 3072 tlsconfig.go:240] Starting DynamicServingCertificateController
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.926645 3072 apiservice_controller.go:97] Starting APIServiceRegistrationController
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.926728 3072 cache.go:32] Waiting for caches to sync for APIServiceRegistrationController controller
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.927054 3072 autoregister_controller.go:141] Starting autoregister controller
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.927121 3072 cache.go:32] Waiting for caches to sync for autoregister controller
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.931916 3072 customresource_discovery_controller.go:209] Starting DiscoveryController
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.932264 3072 cluster_authentication_trust_controller.go:440] Starting cluster_authentication_trust_controller controller
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.932365 3072 shared_informer.go:240] Waiting for caches to sync for cluster_authentication_trust_controller
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.932741 3072 dynamic_serving_content.go:130] Starting aggregator-proxy-cert::/var/lib/rancher/k3s/server/tls/client-auth-proxy.crt::/var/lib/rancher/k3s/server/tls/client-auth-proxy.key
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.933190 3072 crdregistration_controller.go:111] Starting crd-autoregister controller
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.933262 3072 shared_informer.go:240] Waiting for caches to sync for crd-autoregister
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.933438 3072 controller.go:86] Starting OpenAPI controller
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.933581 3072 naming_controller.go:291] Starting NamingConditionController
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.933749 3072 establishing_controller.go:76] Starting EstablishingController
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.934036 3072 nonstructuralschema_controller.go:186] Starting NonStructuralSchemaConditionController
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.934201 3072 apiapproval_controller.go:186] Starting KubernetesAPIApprovalPolicyConformantConditionController
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.934340 3072 crd_finalizer.go:266] Starting CRDFinalizer
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.939106 3072 dynamic_cafile_content.go:167] Starting client-ca-bundle::/var/lib/rancher/k3s/server/tls/client-ca.crt
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.939389 3072 available_controller.go:457] Starting AvailableConditionController
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.939470 3072 cache.go:32] Waiting for caches to sync for AvailableConditionController controller
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.939626 3072 controller.go:83] Starting OpenAPI AggregationController
Dec 08 18:34:02 rpi3 k3s[3072]: I1208 18:34:02.939401 3072 dynamic_cafile_content.go:167] Starting request-header::/var/lib/rancher/k3s/server/tls/request-header-ca.crt
Dec 08 18:34:02 rpi3 k3s[3072]: E1208 18:34:02.984866 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:03 rpi3 k3s[3072]: E1208 18:34:03.091823 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:03 rpi3 k3s[3072]: E1208 18:34:03.211051 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.292313 3072 trace.go:205] Trace[885658523]: "Create" url:/api/v1/namespaces/default/events,user-agent:k3s/v1.19.4+k3s1 (linux/arm64) kubernetes/2532c10,client:127.0.0.1 (08-Dec-2020 18:33:53.285) (total time: 10006ms):
Dec 08 18:34:03 rpi3 k3s[3072]: Trace[885658523]: [10.006453622s] [10.006453622s] END
Dec 08 18:34:03 rpi3 k3s[3072]: E1208 18:34:03.323645 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:03 rpi3 k3s[3072]: time="2020-12-08T18:34:03.386241102Z" level=info msg="Waiting for cloudcontroller rbac role to be created"
Dec 08 18:34:03 rpi3 k3s[3072]: E1208 18:34:03.445568 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:03 rpi3 k3s[3072]: E1208 18:34:03.493942 3072 controller.go:228] failed to get node "rpi3" when trying to set owner ref to the node lease: nodes "rpi3" not found
Dec 08 18:34:03 rpi3 k3s[3072]: E1208 18:34:03.510708 3072 event.go:264] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"rpi3.164ed17649e590c1", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"rpi3", UID:"rpi3", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"InvalidDiskCapacity", Message:"invalid capacity 0 on image filesystem", Source:v1.EventSource{Component:"kubelet", Host:"rpi3"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbfec0fa1bb3ed4c1, ext:33294385998, loc:(*time.Location)(0x65be780)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbfec0fa1bb3ed4c1, ext:33294385998, loc:(*time.Location)(0x65be780)}}, Count:1, Type:"Warning", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "rpi3.164ed17649e590c1" is forbidden: not yet ready to handle request' (will not retry!)
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.527121 3072 cache.go:39] Caches are synced for APIServiceRegistrationController controller
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.527464 3072 cache.go:39] Caches are synced for autoregister controller
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.532656 3072 shared_informer.go:247] Caches are synced for cluster_authentication_trust_controller
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.534442 3072 shared_informer.go:247] Caches are synced for crd-autoregister
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.540657 3072 cache.go:39] Caches are synced for AvailableConditionController controller
Dec 08 18:34:03 rpi3 k3s[3072]: E1208 18:34:03.548786 3072 kubelet.go:2183] node "rpi3" not found
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.620688 3072 trace.go:205] Trace[249778785]: "Create" url:/api/v1/nodes,user-agent:k3s/v1.19.4+k3s1 (linux/arm64) kubernetes/2532c10,client:127.0.0.1 (08-Dec-2020 18:33:54.155) (total time: 9465ms):
Dec 08 18:34:03 rpi3 k3s[3072]: Trace[249778785]: ---"Object stored in database" 9463ms (18:34:00.619)
Dec 08 18:34:03 rpi3 k3s[3072]: Trace[249778785]: [9.465070253s] [9.465070253s] END
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.629508 3072 kubelet_node_status.go:73] Successfully registered node rpi3
Dec 08 18:34:03 rpi3 k3s[3072]: E1208 18:34:03.635735 3072 controller.go:151] Unable to perform initial Kubernetes service initialization: Service "kubernetes" is invalid: spec.clusterIP: Invalid value: "10.43.0.1": cannot allocate resources of type serviceipallocations at this time
Dec 08 18:34:03 rpi3 k3s[3072]: E1208 18:34:03.657323 3072 controller.go:156] Unable to remove old endpoints from kubernetes service: StorageError: key not found, Code: 1, Key: /registry/masterleases/192.168.1.246, ResourceVersion: 0, AdditionalErrorMsg:
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.920374 3072 controller.go:132] OpenAPI AggregationController: action for item : Nothing (removed from the queue).
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.924393 3072 controller.go:132] OpenAPI AggregationController: action for item k8s_internal_local_delegation_chain_0000000000: Nothing (removed from the queue).
Dec 08 18:34:04 rpi3 k3s[3072]: I1208 18:34:04.179528 3072 storage_scheduling.go:134] created PriorityClass system-node-critical with value 2000001000
Dec 08 18:34:04 rpi3 k3s[3072]: I1208 18:34:04.351756 3072 storage_scheduling.go:134] created PriorityClass system-cluster-critical with value 2000000000
Dec 08 18:34:04 rpi3 k3s[3072]: I1208 18:34:04.351911 3072 storage_scheduling.go:143] all system priority classes are created successfully or already exist.
Dec 08 18:34:04 rpi3 k3s[3072]: time="2020-12-08T18:34:04.431836422Z" level=info msg="Waiting for cloudcontroller rbac role to be created"
Dec 08 18:34:04 rpi3 k3s[3072]: time="2020-12-08T18:34:04.676330952Z" level=info msg="Waiting for node rpi3 CIDR not assigned yet"
Dec 08 18:34:05 rpi3 k3s[3072]: time="2020-12-08T18:34:05.214830897Z" level=warning msg="Unable to watch for tunnel endpoints: unknown (get endpoints)"
Dec 08 18:34:05 rpi3 k3s[3072]: time="2020-12-08T18:34:05.473473287Z" level=info msg="Waiting for cloudcontroller rbac role to be created"
Dec 08 18:34:06 rpi3 k3s[3072]: time="2020-12-08T18:34:06.508794519Z" level=info msg="Waiting for cloudcontroller rbac role to be created"
Dec 08 18:34:06 rpi3 k3s[3072]: time="2020-12-08T18:34:06.701369744Z" level=info msg="Waiting for node rpi3 CIDR not assigned yet"
@brandond I have followed your advice and reconfigured the cluster to make one of the Pi 4s the master. In the new configuration there is one Pi 4 as master and the other as worker node (the Pi 3B is unresponsive, I need to troubleshoot what happened to it).
I am getting the same issue as before: if I shut down the devices following the appropriate procedure, then on reboot the master comes up and runs but the worker goes into NotReady status and never recovers. Immediately after powering up the cluster I ran kubectl get nodes and both appeared Ready, but a minute later the worker's status changed and it has been unresponsive since.
Below is the output from kubectl describe. The master's events look normal, but on the worker no activity is registered whatsoever.
Everything else seems to work normally. The pods assigned to the worker have been relaunched on the master.
PS: To give more input regarding the hardware, both Pis have high-speed SD cards.
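For reference, the worker was joined with the standard agent install, roughly like this (token value redacted; the master's IP is 192.168.1.142):
# On pi4red: point the agent at the new master using the node token from the server
curl -sfL https://get.k3s.io | K3S_URL=https://192.168.1.142:6443 K3S_TOKEN=<token from /var/lib/rancher/k3s/server/node-token on pi4blue> sh -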
Logs from pi4blue
Name: pi4blue
Roles: master
Labels: beta.kubernetes.io/arch=arm
beta.kubernetes.io/instance-type=k3s
beta.kubernetes.io/os=linux
k3s.io/hostname=pi4blue
k3s.io/internal-ip=192.168.1.142
kubernetes.io/arch=arm
kubernetes.io/hostname=pi4blue
kubernetes.io/os=linux
node-role.kubernetes.io/master=true
node.kubernetes.io/instance-type=k3s
Annotations: flannel.alpha.coreos.com/backend-data: {"VtepMAC":"fe:ba:e4:25:87:a9"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 192.168.1.142
k3s.io/node-args: ["server"]
k3s.io/node-config-hash: RCK5KK43QJVI3DZORFHFQOWFMNEM6XGEJNSFOMGJYIVHABPZMHBQ====
k3s.io/node-env: {"K3S_DATA_DIR":"/var/lib/rancher/k3s/data/b46300d70fe21c458e9a951f12a5c6dd86eb7cf2d0b213bb9ad07dbad435207e"}
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 08 Dec 2020 17:22:07 +0100
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: pi4blue
AcquireTime: <unset>
RenewTime: Tue, 08 Dec 2020 19:47:54 +0100
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Tue, 08 Dec 2020 19:41:41 +0100 Tue, 08 Dec 2020 19:41:41 +0100 FlannelIsUp Flannel is running on this node
MemoryPressure False Tue, 08 Dec 2020 19:46:48 +0100 Tue, 08 Dec 2020 17:22:03 +0100 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 08 Dec 2020 19:46:48 +0100 Tue, 08 Dec 2020 17:22:03 +0100 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 08 Dec 2020 19:46:48 +0100 Tue, 08 Dec 2020 17:22:03 +0100 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Tue, 08 Dec 2020 19:46:48 +0100 Tue, 08 Dec 2020 17:22:17 +0100 KubeletReady kubelet is posting ready status. WARNING: CPU hardcapping unsupported
Addresses:
InternalIP: 192.168.1.142
Hostname: pi4blue
Capacity:
cpu: 4
ephemeral-storage: 29278068Ki
memory: 4051024Ki
pods: 110
Allocatable:
cpu: 4
ephemeral-storage: 28481704529
memory: 4051024Ki
pods: 110
System Info:
Machine ID: 9003dfb694cb6cb8f8a5b1a95fc57f34
System UUID: 9003dfb694cb6cb8f8a5b1a95fc57f34
Boot ID: b86ebec4-cc7b-44b3-933e-435c8b1a133b
Kernel Version: 4.19.75-v7l+
OS Image: Raspbian GNU/Linux 10 (buster)
Operating System: linux
Architecture: arm
Container Runtime Version: containerd://1.4.1-k3s1
Kubelet Version: v1.19.4+k3s1
Kube-Proxy Version: v1.19.4+k3s1
PodCIDR: 10.42.0.0/24
PodCIDRs: 10.42.0.0/24
ProviderID: k3s://pi4blue
Non-terminated Pods: (9 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
monitoring prometheus-adapter-585b57857b-s4t9n 0 (0%) 0 (0%) 0 (0%) 0 (0%) 117m
monitoring arm-exporter-6cpf8 60m (1%) 120m (3%) 70Mi (1%) 140Mi (3%) 118m
kube-system local-path-provisioner-7ff9579c6-q96pm 0 (0%) 0 (0%) 0 (0%) 0 (0%) 145m
monitoring node-exporter-xw9l8 112m (2%) 270m (6%) 200Mi (5%) 220Mi (5%) 117m
kube-system metrics-server-7b4f8b595-7wvkr 0 (0%) 0 (0%) 0 (0%) 0 (0%) 145m
kube-system svclb-traefik-5gxj8 0 (0%) 0 (0%) 0 (0%) 0 (0%) 144m
kube-system coredns-66c464876b-zw8rc 100m (2%) 0 (0%) 70Mi (1%) 170Mi (4%) 145m
kube-system traefik-5dd496474-62jf8 0 (0%) 0 (0%) 0 (0%) 0 (0%) 144m
monitoring prometheus-k8s-0 200m (5%) 200m (5%) 450Mi (11%) 50Mi (1%) 117m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 472m (11%) 590m (14%)
memory 790Mi (19%) 580Mi (14%)
ephemeral-storage 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 6m21s kubelet, pi4blue Starting kubelet.
Warning InvalidDiskCapacity 6m21s kubelet, pi4blue invalid capacity 0 on image filesystem
Normal NodeHasSufficientMemory 6m20s kubelet, pi4blue Node pi4blue status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 6m20s kubelet, pi4blue Node pi4blue status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 6m20s kubelet, pi4blue Node pi4blue status is now: NodeHasSufficientPID
Normal NodeAllocatableEnforced 6m20s kubelet, pi4blue Updated Node Allocatable limit across pods
Warning Rebooted 6m14s kubelet, pi4blue Node pi4blue has been rebooted, boot id: b86ebec4-cc7b-44b3-933e-435c8b1a133b
Normal Starting 6m13s kube-proxy, pi4blue Starting kube-proxy.
Logs from pi4red
Name: pi4red
Roles: <none>
Labels: beta.kubernetes.io/arch=arm
beta.kubernetes.io/instance-type=k3s
beta.kubernetes.io/os=linux
k3s.io/hostname=pi4red
k3s.io/internal-ip=192.168.1.143
kubernetes.io/arch=arm
kubernetes.io/hostname=pi4red
kubernetes.io/os=linux
node.kubernetes.io/instance-type=k3s
Annotations: flannel.alpha.coreos.com/backend-data: {"VtepMAC":"ae:96:67:9b:4b:92"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 192.168.1.143
k3s.io/node-args: ["agent"]
k3s.io/node-config-hash: PYSY2K536A6SKSWOUREBPNXZ4NS5NEVH5ZDOE6NXMV5ULKFARC4A====
k3s.io/node-env:
{"K3S_DATA_DIR":"/var/lib/rancher/k3s/data/b46300d70fe21c458e9a951f12a5c6dd86eb7cf2d0b213bb9ad07dbad435207e","K3S_TOKEN":"********","K3S_U...
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 08 Dec 2020 17:26:54 +0100
Taints: node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unreachable:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: pi4red
AcquireTime: <unset>
RenewTime: Tue, 08 Dec 2020 18:28:07 +0100
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Tue, 08 Dec 2020 17:26:57 +0100 Tue, 08 Dec 2020 17:26:57 +0100 FlannelIsUp Flannel is running on this node
MemoryPressure Unknown Tue, 08 Dec 2020 18:26:32 +0100 Tue, 08 Dec 2020 19:42:52 +0100 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Tue, 08 Dec 2020 18:26:32 +0100 Tue, 08 Dec 2020 19:42:52 +0100 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Tue, 08 Dec 2020 18:26:32 +0100 Tue, 08 Dec 2020 19:42:52 +0100 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Tue, 08 Dec 2020 18:26:32 +0100 Tue, 08 Dec 2020 19:42:52 +0100 NodeStatusUnknown Kubelet stopped posting node status.
Addresses:
InternalIP: 192.168.1.143
Hostname: pi4red
Capacity:
cpu: 4
ephemeral-storage: 29278068Ki
memory: 4051024Ki
pods: 110
Allocatable:
cpu: 4
ephemeral-storage: 28481704529
memory: 4051024Ki
pods: 110
System Info:
Machine ID: 0c614fc95172029ca90987fe5fc57ee3
System UUID: 0c614fc95172029ca90987fe5fc57ee3
Boot ID: cc7e14dc-bffa-4d03-a837-6754d86c3e01
Kernel Version: 4.19.75-v7l+
OS Image: Raspbian GNU/Linux 10 (buster)
Operating System: linux
Architecture: arm
Container Runtime Version: containerd://1.4.1-k3s1
Kubelet Version: v1.19.4+k3s1
Kube-Proxy Version: v1.19.4+k3s1
PodCIDR: 10.42.2.0/24
PodCIDRs: 10.42.2.0/24
ProviderID: k3s://pi4red
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
monitoring arm-exporter-k5hxx 60m (1%) 120m (3%) 70Mi (1%) 140Mi (3%) 119m
monitoring node-exporter-hcwdr 112m (2%) 270m (6%) 200Mi (5%) 220Mi (5%) 119m
kube-system svclb-traefik-qql2l 0 (0%) 0 (0%) 0 (0%) 0 (0%) 142m
monitoring grafana-7cccfc9b5f-8dk6n 100m (2%) 200m (5%) 100Mi (2%) 200Mi (5%) 119m
monitoring kube-state-metrics-6cb6df5d4-qvptw 0 (0%) 0 (0%) 0 (0%) 0 (0%) 119m
monitoring alertmanager-main-0 100m (2%) 100m (2%) 225Mi (5%) 25Mi (0%) 119m
monitoring prometheus-operator-67755f959-m4xgr 100m (2%) 200m (5%) 100Mi (2%) 200Mi (5%) 119m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 472m (11%) 890m (22%)
memory 695Mi (17%) 785Mi (19%)
ephemeral-storage 0 (0%) 0 (0%)
Events: <none>
@agilob your response times are way too long. I suspect that your SD card isn't quite up to the task - different vendors and models make way more of a difference than you might expect. You NEVER want to see times over a couple of seconds; at 10 seconds, internal Kubernetes components time out, causing fatal errors that terminate the process. Here are the times from your log:
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.292313 3072 trace.go:205] Trace[885658523]: "Create" url:/api/v1/namespaces/default/events,user-agent:k3s/v1.19.4+k3s1 (linux/arm64) kubernetes/2532c10,client:127.0.0.1 (08-Dec-2020 18:33:53.285) (total time: 10006ms):
Dec 08 18:34:03 rpi3 k3s[3072]: Trace[885658523]: [10.006453622s] [10.006453622s] END
--
Dec 08 18:34:03 rpi3 k3s[3072]: I1208 18:34:03.620688 3072 trace.go:205] Trace[249778785]: "Create" url:/api/v1/nodes,user-agent:k3s/v1.19.4+k3s1 (linux/arm64) kubernetes/2532c10,client:127.0.0.1 (08-Dec-2020 18:33:54.155) (total time: 9465ms):
Dec 08 18:34:03 rpi3 k3s[3072]: Trace[249778785]: ---"Object stored in database" 9463ms (18:34:00.619)
Dec 08 18:34:03 rpi3 k3s[3072]: Trace[249778785]: [9.465070253s] [9.465070253s] END
Here are worst-case times from my node:
root@pi03:~# journalctl -u k3s | grep -C 1 'Object stored'
Dec 08 09:48:18 pi03.lan.khaus k3s[2161]: Trace[1030085022]: ---"About to apply patch" 849ms (09:48:00.562)
Dec 08 09:48:18 pi03.lan.khaus k3s[2161]: Trace[1030085022]: ---"Object stored in database" 257ms (09:48:00.827)
Dec 08 09:48:18 pi03.lan.khaus k3s[2161]: Trace[1030085022]: [1.14311659s] [1.14311659s] END
--
Dec 08 09:48:19 pi03.lan.khaus k3s[2161]: I1208 09:48:19.529050 2161 trace.go:205] Trace[1425473484]: "Update" url:/apis/discovery.k8s.io/v1beta1/namespaces/kube-system/endpointslices/traefik-74wwv,user-agent:k3s/v1.19.4+k3s1 (linux/arm) kubernetes/2532c10/system:serviceaccount:kube-system:endpointslice-controller,client:127.0.0.1 (08-Dec-2020 09:48:18.960) (total time: 568ms):
Dec 08 09:48:19 pi03.lan.khaus k3s[2161]: Trace[1425473484]: ---"Object stored in database" 567ms (09:48:00.528)
Dec 08 09:48:19 pi03.lan.khaus k3s[2161]: Trace[1425473484]: [568.118123ms] [568.118123ms] END
--
Dec 08 09:48:19 pi03.lan.khaus k3s[2161]: I1208 09:48:19.616776 2161 trace.go:205] Trace[1254621603]: "Patch" url:/apis/apps/v1/namespaces/kube-system/daemonsets/svclb-traefik,user-agent:k3s/v1.19.4+k3s1 (linux/arm) kubernetes/2532c10,client:127.0.0.1 (08-Dec-2020 09:48:19.037) (total time: 578ms):
Dec 08 09:48:19 pi03.lan.khaus k3s[2161]: Trace[1254621603]: ---"Object stored in database" 555ms (09:48:00.604)
Dec 08 09:48:19 pi03.lan.khaus k3s[2161]: Trace[1254621603]: [578.514837ms] [578.514837ms] END
--
Dec 08 09:48:20 pi03.lan.khaus k3s[2161]: I1208 09:48:20.111368 2161 trace.go:205] Trace[950740528]: "Update" url:/api/v1/namespaces/kube-system/endpoints/rancher.io-local-path,user-agent:local-path-provisioner/v0.0.0 (linux/arm) kubernetes/$Format,client:10.42.0.5 (08-Dec-2020 09:48:19.022) (total time: 1088ms):
Dec 08 09:48:20 pi03.lan.khaus k3s[2161]: Trace[950740528]: ---"Object stored in database" 1087ms (09:48:00.110)
Dec 08 09:48:20 pi03.lan.khaus k3s[2161]: Trace[950740528]: [1.088999454s] [1.088999454s] END
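If you want a rough sense of how your card compares, a simple sync-write test gives a ballpark number (just a sketch; run it on the SD-backed filesystem and delete the file afterwards):
# Rough sequential write test with forced sync; a slow card will only manage a few MB/s here
dd if=/dev/zero of=./ddtest bs=1M count=256 oflag=dsync
rm ./ddtest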
@Miguelerja the describe nodes output includes events but not logs. Can you get the actual k3s service logs from both nodes?
@brandond Sorry, I am pretty new at this. Here you go.
Logs from master: journalctl -u k3s. There are no logs at all on the worker.
-- Logs begin at Thu 2019-02-14 10:11:59 UTC. --
Dec 08 19:21:36 pi4blue k3s[967]: W1208 19:21:36.218808 967 machine.go:253] Cannot determine CPU /sys/bus/cpu/devices/cpu0 online state, skipping
Dec 08 19:21:36 pi4blue k3s[967]: W1208 19:21:36.218937 967 machine.go:253] Cannot determine CPU /sys/bus/cpu/devices/cpu1 online state, skipping
Dec 08 19:21:36 pi4blue k3s[967]: W1208 19:21:36.219043 967 machine.go:253] Cannot determine CPU /sys/bus/cpu/devices/cpu2 online state, skipping
Dec 08 19:21:36 pi4blue k3s[967]: W1208 19:21:36.219147 967 machine.go:253] Cannot determine CPU /sys/bus/cpu/devices/cpu3 online state, skipping
Dec 08 19:21:36 pi4blue k3s[967]: E1208 19:21:36.219192 967 machine.go:72] Cannot read number of physical cores correctly, number of cores set to 0
Dec 08 19:21:36 pi4blue k3s[967]: W1208 19:21:36.219749 967 machine.go:253] Cannot determine CPU /sys/bus/cpu/devices/cpu0 online state, skipping
Dec 08 19:21:36 pi4blue k3s[967]: W1208 19:21:36.219872 967 machine.go:253] Cannot determine CPU /sys/bus/cpu/devices/cpu1 online state, skipping
Dec 08 19:21:36 pi4blue k3s[967]: W1208 19:21:36.219998 967 machine.go:253] Cannot determine CPU /sys/bus/cpu/devices/cpu2 online state, skipping
Dec 08 19:21:36 pi4blue k3s[967]: W1208 19:21:36.220103 967 machine.go:253] Cannot determine CPU /sys/bus/cpu/devices/cpu3 online state, skipping
Dec 08 19:21:36 pi4blue k3s[967]: E1208 19:21:36.220147 967 machine.go:86] Cannot read number of sockets correctly, number of sockets set to 0
Logs from master: journalctl -u k3s. There are no logs at all on the worker.
You need to add --no-pager to remove the stupid line cropping: journalctl -u k3s --no-pager -f. The -f flag follows the log, so you keep getting new lines as they come in.
@brandond thanks for the info, I'm going to try with a faster card now. This time k3s-server started, but the RPi ran out of memory on sudo kubectl get nodes. It's completely stuck; I had to unplug it from power.
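If I get it running again I might also try trimming the bundled components to reduce memory pressure, something along these lines (untested on my setup):
# Reinstall the server without traefik and metrics-server to cut memory use
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --disable traefik --disable metrics-server" sh -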
Thanks @agilob! I edited the last post
There's still not really anything there from the server. Can you run journalctl --no-pager -u k3s and grab the last day or so of logs? Attach them to your comment instead of pasting them inline. On the agent you need to run journalctl --no-pager -u k3s-agent, since agents have a different unit name.
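Something along these lines should capture it (the --since window is just a suggestion):
# On the server
journalctl --no-pager -u k3s --since "1 day ago" > k3s-server.log
# On the agent
journalctl --no-pager -u k3s-agent --since "1 day ago" > k3s-agent.log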
@brandond I think this is it. For the master I have trimmed a big portion of the logs. Also, to get these logs I had to turn the cluster on again, and I had no issues on this run; both nodes were running fine from the start...
Those are all the correct logs, but I don't see any errors - probably because all the pods were already running. Pods don't get restarted when k3s is restarted. You might try a full reboot and then collect logs after?
I have just turned the cluster on again after being off all night and everything is going down now. These are the logs. I can SSH into both nodes, but I cannot ping them and I cannot get any response from kubectl. The SSH session also breaks down after a while with a broken pipe... I have tried rebooting both Pis twice but it didn't change anything. No clue what's going on.
Your server logs are all truncated...
Sorry, I was going quickly and having issues connecting to the Pis. I have edited the logs in the comment.
So the issues today seem to have been configuration errors on my side. Everything seems to run normally now. I will be closing the issue since I don't think anything here was related to k3s.
Thanks a lot for the support!
@brandond What SD card do you recommend?
I don't have a specific brand, but I have had good luck with Class 10 SDHC cards. Go with larger sizes (64GB should be OK) to allow for wear leveling so they don't get burned out.
OK, that's what I have now: SanDisk Ultra or Samsung EVO, but 32GB. They're filled to 5GB max. I'm upgrading to an RPi 4 now and don't want to make the mistake of getting a card that's too slow.
Correction: the RPi 3 mentioned above has a 64GB SDHC UHS-1 card.
Environmental Info: K3s Version: v1.19.4+k3s1
Node(s) CPU architecture, OS, and Version: All three Pis are running HypriotOS version 1.12, but this issue has also been reproduced on the latest release of Raspbian Lite.
Master: Linux pi3black 4.19.75-v7+ #1270 SMP Tue Sep 24 18:45:11 BST 2019 armv7l GNU/Linux
Node 1: Linux pi4red 4.19.75-v7l+ #1270 SMP Tue Sep 24 18:51:41 BST 2019 armv7l GNU/Linux
Node 2: Linux pi4blue 4.19.75-v7l+ #1270 SMP Tue Sep 24 18:51:41 BST 2019 armv7l GNU/Linux
Cluster Configuration: The cluster is built with a Raspberry Pi 3B running as master and two Raspberry Pi 4B (4GB) boards as workers. All of them are using a 32GB microSD card.
Describe the bug: After setting up the cluster with k3s as indicated in the official documentation, it works well until a reboot is performed. In this specific case all three Pis were shut down following the recommended procedure before disconnecting power. Once the cluster is powered up again, the master node becomes Ready but both workers remain in NotReady status.
The only way to get the workers back up has been to uninstall and reinstall k3s on them.
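For reference, the recovery on an affected worker looks roughly like this (assuming the default install scripts; rejoining afterwards is the same as the initial agent install):
# On the affected worker: remove k3s completely, then reinstall/rejoin as during initial setup
/usr/local/bin/k3s-agent-uninstall.sh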
Two more things have been observed:
Steps To Reproduce:
Expected behavior: Worker nodes come back up after a reboot without having to uninstall and reinstall them.
Actual behavior: After a reboot the worker nodes remain NotReady.
Additional context / logs: Logs from journalctl -u k3s on the master (on the worker nodes no logs are returned)
Logs from kubectl describe nodes pi3black
Logs from kubectl describe nodes pi4red (same events for the other agent)
Logs from kubectl get nodes -w over 30 minutes after boot