k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

K3s crashes on startup with error flannel exited: failed to acquire lease: nodes "<Node Name>" is forbidden: not yet ready to handle request #2989

Closed: brandond closed this issue 2 years ago

brandond commented 3 years ago

Originally posted by @nirui in https://github.com/k3s-io/k3s/issues/2509#issuecomment-786573486

Not sure if my issue was related.

But I got the same flannel exited: failed to acquire lease: nodes "<Node Name>" is forbidden: not yet ready to handle request error after a reboot, and then the cluster kept crashing.

I fixed the problem by disabling the built-in flannel and installing my own instead. Basically, I changed my k3s.service to the following:

ExecStart=/usr/local/bin/k3s \
    server \
        .....
        '--flannel-backend=none' \

Then, once the cluster started up, I installed the CNI plugins from https://github.com/containernetworking/plugins/releases and ran kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
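
Roughly, the workaround steps were along these lines (the plugin version and arch below are placeholders; pick whichever release matches your board):

mkdir -p /opt/cni/bin
# install the reference CNI plugins (placeholder version/arch)
wget https://github.com/containernetworking/plugins/releases/download/v0.9.1/cni-plugins-linux-arm-v0.9.1.tgz
tar -xzf cni-plugins-linux-arm-v0.9.1.tgz -C /opt/cni/bin
# deploy flannel itself as a DaemonSet
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml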

I don't think this problem was related to the speed of the SD card, because I used the same card, and the problem was simply gone after I disabled the built-in flannel.

Maybe this will kick you smart & nice guys to look into this issue a little more? blink blink I mean... maybe the built-in flannel should... you know, just keep retrying instead of crashing? :D

nirui commented 3 years ago

Nice, dude!

So... the fix will be available in the next version, which will be released in a few days, I assume? :D

brandond commented 3 years ago

1.21 most likely, unless we opt to backport it to the next 1.20 patch release.

nirui commented 3 years ago

Great! I'll just install the new version when it becomes available. Thank you for the hard work.

rancher-max commented 3 years ago

I'm unable to reproduce this issue with the devices I have at home, so unfortunately I can't give a good read on validation yet. I'll see what I can do about recreating and validating before we release this, but @nirui, if you have a test system you want to give it a try on, you can install from a commit ID like:

curl -sfL https://get.k3s.io | INSTALL_K3S_COMMIT=8ace8975d293bf6eb46e27d207fb667a47d282a5 sh -

brandond commented 3 years ago

@rancher-max I don't have a good way to reproduce this on demand either, since it appears to be a flake caused by slower IO and CPU on the node. I think I'm OK with just closing it out for the moment; we can reopen if someone is able to reproduce it with the current fix applied.

nirui commented 3 years ago

Yeah, you can close this for now if it troubles you. I'll test the fix later this week and post my findings, if there are any.

Also, the hardware is indeed slow, but it boots fine with --no-flannel, which is why I think it's weird and worth reporting.

nirui commented 3 years ago

So... I just want to come back and report that the fix did not work...

I've rebuilt the test environment with two of the aforementioned devices. Both nodes in the cluster were initialized with INSTALL_K3S_COMMIT=8ace8975d293bf6eb46e27d207fb667a47d282a5 as indicated in the comment above, and the agent node with the two additional parameters K3S_URL and K3S_TOKEN.
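
For reference, the install commands were roughly as follows (the server address and token are placeholders):

# control node
curl -sfL https://get.k3s.io | INSTALL_K3S_COMMIT=8ace8975d293bf6eb46e27d207fb667a47d282a5 sh -
# agent node
curl -sfL https://get.k3s.io | INSTALL_K3S_COMMIT=8ace8975d293bf6eb46e27d207fb667a47d282a5 \
    K3S_URL=https://<control-node-ip>:6443 K3S_TOKEN=<node-token> sh -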

After confirming that the agent node had joined the cluster and the cluster was fully up and operational, I performed systemctl restart k3s on the control node. The command failed, and produced the following error logs:

k3s[15660]: I0312 19:42:25.613035   15660 trace.go:205] Trace[1541134339]: "List etcd3" key:/traefik.containo.us/traefikservices,resourceVersion:0,resourceVersionMatch:,limit:500,continue: (12-Mar-2021 19:42:24.878) (total time: 734ms):
k3s[15660]: Trace[1541134339]: [734.623511ms] [734.623511ms] END
k3s[15660]: I0312 19:42:25.620330   15660 trace.go:205] Trace[1848616586]: "List" url:/apis/traefik.containo.us/v1alpha1/traefikservices,user-agent:traefik/2.4.2 (linux/arm) kubernetes/crd,client:10.42.0.2 (12-Mar-2021 19:42:24.877) (total time: 742ms):
k3s[15660]: Trace[1848616586]: ---"Listing from storage done" 741ms (19:42:00.619)
k3s[15660]: Trace[1848616586]: [742.340319ms] [742.340319ms] END
k3s[15660]: I0312 19:42:25.656805   15660 trace.go:205] Trace[1149487375]: "List etcd3" key:/traefik.containo.us/traefikservices,resourceVersion:,resourceVersionMatch:,limit:10000,continue: (12-Mar-2021 19:42:24.888) (total time: 767ms):
k3s[15660]: Trace[1149487375]: [767.912176ms] [767.912176ms] END
k3s[15660]: E0312 19:42:26.858698   15660 controller.go:156] Unable to remove old endpoints from kubernetes service: no master IPs were listed in storage, refusing to erase all endpoints for the kubernetes service
k3s[15660]: E0312 19:42:28.153220   15660 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.43.247.207:443/apis/metrics.k8s.io/v1beta1: Get "https://10.43.247.207:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.43.247.207:443: connect: no route to host
k3s[15660]: E0312 19:42:31.240932   15660 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.43.247.207:443/apis/metrics.k8s.io/v1beta1: Get "https://10.43.247.207:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.43.247.207:443: connect: no route to host
k3s[15660]: F0312 19:42:32.031358   15660 controllermanager.go:168] error building controller context: failed to wait for apiserver being healthy: timed out waiting for the condition: failed to get apiserver /healthz status: an error on the server ("[+]ping ok\n[+]log ok\n[+]etcd ok\n[+]poststarthook/start-kube-apiserver-admission-initializer ok\n[+]poststarthook/generic-apiserver-start-informers ok\n[+]poststarthook/priority-and-fairness-config-consumer ok\n[+]poststarthook/priority-and-fairness-filter ok\n[+]poststarthook/start-apiextensions-informers ok\n[+]poststarthook/start-apiextensions-controllers ok\n[+]poststarthook/crd-informer-synced ok\n[+]poststarthook/bootstrap-controller ok\n[-]poststarthook/rbac/bootstrap-roles failed: reason withheld\n[+]poststarthook/scheduling/bootstrap-system-priority-classes ok\n[+]poststarthook/priority-and-fairness-config-producer ok\n[+]poststarthook/start-cluster-authentication-info-controller ok\n[+]poststarthook/aggregator-reload-proxy-client-cert ok\n[+]poststarthook/start-kube-aggregator-informers ok\n[-]poststarthook/apiservice-registration-controller failed: reason withheld\n[+]poststarthook/apiservice-status-available-controller ok\n[+]poststarthook/kube-apiserver-autoregistration ok\n[+]autoregister-completion ok\n[+]poststarthook/apiservice-openapi-controller ok\nhealthz check failed") has prevented the request from succeeding
k3s[15660]: goroutine 7019 [running]:
k3s[15660]: github.com/rancher/k3s/vendor/k8s.io/klog/v2.stacks(0x59e4e01, 0x0, 0x56d, 0x5b4)
k3s[15660]:         /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/v2/klog.go:1026 +0x94
k3s[15660]: github.com/rancher/k3s/vendor/k8s.io/klog/v2.(*loggingT).output(0x59ceab8, 0x3, 0x0, 0x0, 0xcec8a20, 0x56bdc7e, 0x14, 0xa8, 0x0)
k3s[15660]:         /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/v2/klog.go:975 +0x110
<Stack trace follows>

Based on the log, I suspected the new error was caused by the hardcoded timeout introduced in commit f970e49b7d37f642150bcdcbbc4c7da63ea0eb8f being too short. So I cloned this repository (at 8ace8975d293bf6eb46e27d207fb667a47d282a5), changed the timeout to 1 hour, then recompiled by following these instructions. The commands systemctl stop k3s and k3s-killall.sh were used to shut down the failing control node, the newly compiled binary was deployed, and then systemctl restart k3s was run. But the same error still occurred.

(To be precise, I built 3 binaries: one directly from the source, one with a 1-hour timeout, and one that wraps the wait call in an infinite retry loop. Each was built by invoking SKIP_VALIDATE=true make after saving the modified code. The 1-hour-timeout binary is the one actually used for the test. One weird thing: all 3 binaries have exactly the same file size but different binary contents.)
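
The rebuild-and-redeploy cycle was roughly as follows (a sketch; the dist/artifacts output path is per the build docs, and the binary name may differ per arch):

git clone https://github.com/k3s-io/k3s.git && cd k3s
git checkout 8ace8975d293bf6eb46e27d207fb667a47d282a5
# ...edit the timeout in the source, then rebuild:
SKIP_VALIDATE=true make
# shut down the failing control node and swap in the new binary
systemctl stop k3s && k3s-killall.sh
cp dist/artifacts/k3s /usr/local/bin/k3s
systemctl restart k3s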

Fetching the cluster info with kubectl repeatedly returned the following:

~# kubectl get all --all-namespaces
E0312 20:22:34.674438   21761 request.go:1011] Unexpected error when reading response body: unexpected EOF
unexpected error when reading response body. Please retry. Original error: unexpected EOF
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
~# kubectl get all --all-namespaces
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get replicationcontrollers)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get services)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get daemonsets.apps)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get deployments.apps)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get replicasets.apps)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get statefulsets.apps)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get horizontalpodautoscalers.autoscaling)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get jobs.batch)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get cronjobs.batch)
~# kubectl get all --all-namespaces
NAMESPACE     NAME                                          READY   STATUS      RESTARTS   AGE
kube-system   pod/helm-install-traefik-crd-6fqnp            0/1     Completed   0          131m
kube-system   pod/helm-install-traefik-lgq9d                0/1     Completed   2          131m
kube-system   pod/svclb-traefik-dgncc                       2/2     Running     0          119m
kube-system   pod/local-path-provisioner-5ff76fc89d-7brdn   0/1     Error       9          131m
kube-system   pod/metrics-server-86cbb8457f-5m2b6           0/1     Error       5          131m
kube-system   pod/traefik-8469c8586b-fvdnx                  0/1     Unknown     2          125m
kube-system   pod/coredns-854c77959c-s6gzb                  0/1     Unknown     2          131m
kube-system   pod/svclb-traefik-54wzp                       0/2     Unknown     4          124m

NAMESPACE     NAME                     TYPE           CLUSTER-IP      EXTERNAL-IP                     PORT(S)                      AGE
default       service/kubernetes       ClusterIP      10.43.0.1       <none>                          443/TCP                      132m
kube-system   service/kube-dns         ClusterIP      10.43.0.10      <none>                          53/UDP,53/TCP,9153/TCP       131m
kube-system   service/metrics-server   ClusterIP      10.43.247.207   <none>                          443/TCP                      131m
kube-system   service/traefik          LoadBalancer   10.43.16.86     10.220.179.140,10.220.179.253   80:30399/TCP,443:31096/TCP   125m
~# kubectl get all --all-namespaces
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get replicationcontrollers)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get services)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get daemonsets.apps)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get deployments.apps)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get replicasets.apps)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get statefulsets.apps)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get horizontalpodautoscalers.autoscaling)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get jobs.batch)
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get cronjobs.batch)

Of course, the --no-flannel trick no longer works, because startup is now blocked by the apiserver waiter.

Here is some additional information which may or may not be useful for this case:

~# cat /etc/systemd/system/k3s.service
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target

[Install]
WantedBy=multi-user.target

[Service]
Type=notify
EnvironmentFile=/etc/systemd/system/k3s.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
    server \

~# crictl --runtime-endpoint unix:///var/run/k3s/containerd/containerd.sock images
IMAGE                                      TAG                 IMAGE ID            SIZE
docker.io/rancher/coredns-coredns          1.8.0               a0ce6ab869a69       11.9MB
docker.io/rancher/klipper-helm             v0.4.3              0bdabf617c29a       47.7MB
docker.io/rancher/klipper-lb               v0.1.2              7d23a14d38d24       2.58MB
docker.io/rancher/library-traefik          2.4.2               253e6b02a96a7       27.1MB
docker.io/rancher/local-path-provisioner   v0.0.19             1e695755cc09d       12.7MB
docker.io/rancher/metrics-server           v0.3.6              d24dd28770a36       10.2MB
docker.io/rancher/pause                    3.1                 e11a8cbeda868       231kB
~# crictl --runtime-endpoint unix:///var/run/k3s/containerd/containerd.sock ps -a
CONTAINER           IMAGE               CREATED                  STATE               NAME                     ATTEMPT             POD ID
91e1cd4428fb9       d24dd28770a36       Less than a second ago   Created             metrics-server           7                   29403bfaa3bd9
2eeaa1dc81238       253e6b02a96a7       2 minutes ago            Running             traefik                  3                   00456951ebfe4
5ec982f2a8bfa       1e695755cc09d       2 minutes ago            Exited              local-path-provisioner   10                  38357e8dd61a5
3c1adbeafaaec       7d23a14d38d24       2 minutes ago            Running             lb-port-443              3                   7bec24357cb31
c1d6c2f9a9556       7d23a14d38d24       2 minutes ago            Running             lb-port-80               3                   7bec24357cb31
f6f68a4a980e2       d24dd28770a36       2 minutes ago            Exited              metrics-server           6                   29403bfaa3bd9
fa8aeb910f29e       a0ce6ab869a69       2 minutes ago            Running             coredns                  3                   467757cf44fb1
7d5c21df93ecd       7d23a14d38d24       44 minutes ago           Exited              lb-port-443              2                   06a3b1553fedc
a8db82089690b       7d23a14d38d24       44 minutes ago           Exited              lb-port-80               2                   06a3b1553fedc
d1823cf986ede       253e6b02a96a7       44 minutes ago           Exited              traefik                  2                   a8d5c4f1fb017
1f29f7922d8c1       a0ce6ab869a69       44 minutes ago           Exited              coredns                  2                   ffc95c04b46a1
70bf79328349d       0bdabf617c29a       2 hours ago              Exited              helm                     2                   9fc123776a44f
2cee92c5db469       0bdabf617c29a       2 hours ago              Exited              helm                     0                   2536ab7078e44

Now, I appreciate the effort that's been put into this. As I understand it, this problem only impacts devices that are too old and slow to run Kubernetes anyway. Add in the fact that you guys don't have the exact device, and that there are only 2 related reports (and the other reporter stopped responding long ago), and maybe it's not a problem worth fixing. I mean, I won't feel pissed if it comes to that.

So... that's all from me so far. Again, thank you!

brandond commented 3 years ago

[-]poststarthook/rbac/bootstrap-roles failed: reason withheld

I'm not sure why this particular item blocks on slow nodes. It's basically responsible for ensuring that all the core RBAC exists every time the cluster starts up; my guess is that it fails in some non-recoverable way if etcd is running slowly.

nirui commented 3 years ago

Is there anything for me to do/test in order to help the rbac situation? Before I completely tear down the test cluster setup as well as my broken little heart? 🙃

brandond commented 3 years ago

If you start with increased verbosity (--v=2 should do it, I think), it will tell you why the rbac hook is not ready when it gets to that point. This is all deep core Kubernetes code, though, and not likely to be something we can fix directly.
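
For example, in the k3s.service shown above (followed by systemctl daemon-reload and systemctl restart k3s):

ExecStart=/usr/local/bin/k3s \
    server \
        '--v=2'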

nirui commented 3 years ago

Understood.

I don't really have enough knowledge to debug the inner workings of Kubernetes. So now I just assume it's either that the hardware can't run Kubernetes, or that I'm too dumb to operate it.

I'll continue experimenting on this cluster to figure things out a little more, but I don't think there is anything worth reporting anymore.

Here's a little more information that I collected during my tests. I'll just paste it here to help future humans learn of my struggles and stupidities for their own entertainment, should they visit the North Pole to see the artifacts. It probably won't help others much. Anyways...

The log that I captured during the crash cycles: k3s-service-crashes.log

Here's me calling kubectl get nodes repeatedly; it seems the API server worked for a few minutes before being shut down:

root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 35m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   142m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
root@cubie0:~# kubectl get nodes
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
root@cubie0:~# kubectl get nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request
root@cubie0:~# kubectl get nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 36m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   143m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 36m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   143m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 36m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   143m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 36m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   143m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 36m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   143m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 36m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   143m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 37m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   143m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 37m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   144m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 37m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   144m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 37m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   144m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 37m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   144m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 37m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   144m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 37m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   144m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 37m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   144m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 37m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   144m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 37m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   144m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie1   Ready    <none>                 38m    v1.20.4+k3s-8ace8975
cubie0   Ready    control-plane,master   145m   v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie0   Ready    control-plane,master   145m   v1.20.4+k3s-8ace8975
cubie1   Ready    <none>                 38m    v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
cubie0   Ready    control-plane,master   145m   v1.20.4+k3s-8ace8975
cubie1   Ready    <none>                 38m    v1.20.4+k3s-8ace8975
root@cubie0:~# kubectl get nodes
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
root@cubie0:~# kubectl get nodes
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
root@cubie0:~# kubectl get nodes
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
root@cubie0:~# 

And the output of kubectl describe nodes when it works:

root@cubie0:~# kubectl describe nodes
Name:               cubie1
Roles:              <none>
Labels:             beta.kubernetes.io/arch=arm
                    beta.kubernetes.io/instance-type=k3s
                    beta.kubernetes.io/os=linux
                    k3s.io/hostname=cubie1
                    k3s.io/internal-ip=10.220.179.140
                    kubernetes.io/arch=arm
                    kubernetes.io/hostname=cubie1
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=k3s
Annotations:        flannel.alpha.coreos.com/backend-data: {"VtepMAC":"d2:1b:14:37:80:91"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.220.179.140
                    k3s.io/node-args: ["agent"]
                    k3s.io/node-config-hash: N624NCONM6NMLAYPK4RLWKY52UNBEJFLO2JFY7Q6NP6QEBN6GZYA====
                    k3s.io/node-env:
                    {"K3S_DATA_DIR":"/var/lib/rancher/k3s/data/3f87137d37e82a44b14b1e280186ecc6b29bf888a9730cc6c907a4c68426b5d4","K3S_TOKEN":"********","K3S_U...
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sun, 14 Mar 2021 21:46:11 +0800
Taints:             <none>
Unschedulable:      false
Lease:
HolderIdentity:  cubie1
AcquireTime:     <unset>
RenewTime:       Sun, 14 Mar 2021 22:57:07 +0800
Conditions:
Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
----                 ------  -----------------                 ------------------                ------                       -------
NetworkUnavailable   False   Sun, 14 Mar 2021 21:46:21 +0800   Sun, 14 Mar 2021 21:46:21 +0800   FlannelIsUp                  Flannel is running on this node
MemoryPressure       False   Sun, 14 Mar 2021 22:52:56 +0800   Sun, 14 Mar 2021 21:46:07 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
DiskPressure         False   Sun, 14 Mar 2021 22:52:56 +0800   Sun, 14 Mar 2021 21:46:07 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
PIDPressure          False   Sun, 14 Mar 2021 22:52:56 +0800   Sun, 14 Mar 2021 21:46:07 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
Ready                True    Sun, 14 Mar 2021 22:52:56 +0800   Sun, 14 Mar 2021 21:46:16 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
InternalIP:  10.220.179.140
Hostname:    cubie1
Capacity:
cpu:                2
ephemeral-storage:  7458672Ki
memory:             1022940Ki
pods:               110
Allocatable:
cpu:                2
ephemeral-storage:  7255796116
memory:             1022940Ki
pods:               110
System Info:
Machine ID:                 59a23be25c384437ad3f08c9585a8d94
System UUID:                59a23be25c384437ad3f08c9585a8d94
Boot ID:                    e4867dfb-de3f-4eb7-91d2-f2eb47635e81
Kernel Version:             5.10.16-sunxi
OS Image:                   Armbian 21.02.2 Buster
Operating System:           linux
Architecture:               arm
Container Runtime Version:  containerd://1.4.3-k3s3
Kubelet Version:            v1.20.4+k3s-8ace8975
Kube-Proxy Version:         v1.20.4+k3s-8ace8975
PodCIDR:                      10.42.1.0/24
PodCIDRs:                     10.42.1.0/24
ProviderID:                   k3s://cubie1
Non-terminated Pods:          (1 in total)
Namespace                   Name                   CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
---------                   ----                   ------------  ----------  ---------------  -------------  ---
kube-system                 svclb-traefik-l6z64    0 (0%)        0 (0%)      0 (0%)           0 (0%)         72m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource           Requests  Limits
--------           --------  ------
cpu                0 (0%)    0 (0%)
memory             0 (0%)    0 (0%)
ephemeral-storage  0 (0%)    0 (0%)
Events:
Type     Reason                   Age                From        Message
----     ------                   ----               ----        -------
Normal   Starting                 72m                kubelet     Starting kubelet.
Warning  InvalidDiskCapacity      72m                kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  72m (x2 over 72m)  kubelet     Node cubie1 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    72m (x2 over 72m)  kubelet     Node cubie1 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     72m (x2 over 72m)  kubelet     Node cubie1 status is now: NodeHasSufficientPID
Normal   NodeAllocatableEnforced  72m                kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 72m                kube-proxy  Starting kube-proxy.
Normal   NodeReady                72m                kubelet     Node cubie1 status is now: NodeReady

Name:               cubie0
Roles:              control-plane,master
Labels:             beta.kubernetes.io/arch=arm
                    beta.kubernetes.io/instance-type=k3s
                    beta.kubernetes.io/os=linux
                    k3s.io/hostname=cubie0
                    k3s.io/internal-ip=10.220.179.253
                    kubernetes.io/arch=arm
                    kubernetes.io/hostname=cubie0
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=true
                    node-role.kubernetes.io/master=true
                    node.kubernetes.io/instance-type=k3s
Annotations:        flannel.alpha.coreos.com/backend-data: {"VtepMAC":"9a:7f:74:c6:35:76"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.220.179.253
                    k3s.io/node-args: ["server","--v","2"]
                    k3s.io/node-config-hash: VKQF3MXOHEN3R6NWURA5F3LPB262NQABQCUONGKU3RRZWIRVTJTQ====
                    k3s.io/node-env: {"K3S_DATA_DIR":"/var/lib/rancher/k3s/data/3f87137d37e82a44b14b1e280186ecc6b29bf888a9730cc6c907a4c68426b5d4"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sun, 14 Mar 2021 19:59:17 +0800
Taints:             <none>
Unschedulable:      false
Lease:
HolderIdentity:  cubie0
AcquireTime:     <unset>
RenewTime:       Sun, 14 Mar 2021 22:57:02 +0800
Conditions:
Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
----                 ------  -----------------                 ------------------                ------                       -------
NetworkUnavailable   False   Sun, 14 Mar 2021 22:56:36 +0800   Sun, 14 Mar 2021 22:56:36 +0800   FlannelIsUp                  Flannel is running on this node
MemoryPressure       False   Sun, 14 Mar 2021 22:56:55 +0800   Sun, 14 Mar 2021 20:04:44 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
DiskPressure         False   Sun, 14 Mar 2021 22:56:55 +0800   Sun, 14 Mar 2021 20:04:44 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
PIDPressure          False   Sun, 14 Mar 2021 22:56:55 +0800   Sun, 14 Mar 2021 20:04:44 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
Ready                True    Sun, 14 Mar 2021 22:56:55 +0800   Sun, 14 Mar 2021 20:04:44 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
InternalIP:  10.220.179.253
Hostname:    cubie0
Capacity:
cpu:                2
ephemeral-storage:  60191424Ki
memory:             1022940Ki
pods:               110
Allocatable:
cpu:                2
ephemeral-storage:  58554217222
memory:             1022940Ki
pods:               110
System Info:
Machine ID:                 59a23be25c384437ad3f08c9585a8d94
System UUID:                59a23be25c384437ad3f08c9585a8d94
Boot ID:                    d02739f8-8ee0-4d4d-befa-73d70d405b4e
Kernel Version:             5.10.16-sunxi
OS Image:                   Armbian 21.02.2 Buster
Operating System:           linux
Architecture:               arm
Container Runtime Version:  containerd://1.4.3-k3s3
Kubelet Version:            v1.20.4+k3s-8ace8975
Kube-Proxy Version:         v1.20.4+k3s-8ace8975
PodCIDR:                      10.42.0.0/24
PodCIDRs:                     10.42.0.0/24
ProviderID:                   k3s://cubie0
Non-terminated Pods:          (5 in total)
Namespace                   Name                                       CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
---------                   ----                                       ------------  ----------  ---------------  -------------  ---
kube-system                 svclb-traefik-7ffdx                        0 (0%)        0 (0%)      0 (0%)           0 (0%)         171m
kube-system                 metrics-server-86cbb8457f-x9bxt            0 (0%)        0 (0%)      0 (0%)           0 (0%)         177m
kube-system                 coredns-854c77959c-4kg9l                   100m (5%)     0 (0%)      70Mi (7%)        170Mi (17%)    177m
kube-system                 traefik-8469c8586b-cj7c8                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         172m
kube-system                 local-path-provisioner-5ff76fc89d-v58mg    0 (0%)        0 (0%)      0 (0%)           0 (0%)         177m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource           Requests   Limits
--------           --------   ------
cpu                100m (5%)  0 (0%)
memory             70Mi (7%)  170Mi (17%)
ephemeral-storage  0 (0%)     0 (0%)
Events:
Type     Reason                   Age    From        Message
----     ------                   ----   ----        -------
Warning  InvalidDiskCapacity      64m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  64m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    64m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     64m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeAllocatableEnforced  64m    kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 63m    kube-proxy  Starting kube-proxy.
Normal   NodeHasNoDiskPressure    61m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientMemory  61m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasSufficientPID     61m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   Starting                 60m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientMemory  59m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    59m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     59m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   Starting                 59m    kube-proxy  Starting kube-proxy.
Warning  InvalidDiskCapacity      57m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  57m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    57m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   Starting                 57m    kube-proxy  Starting kube-proxy.
Normal   NodeHasNoDiskPressure    56m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     56m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeHasSufficientMemory  56m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Warning  InvalidDiskCapacity      56m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeAllocatableEnforced  56m    kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 55m    kube-proxy  Starting kube-proxy.
Warning  InvalidDiskCapacity      54m    kubelet     invalid capacity 0 on image filesystem
Warning  InvalidDiskCapacity      51m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  51m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   Starting                 50m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientMemory  49m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    49m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Warning  InvalidDiskCapacity      49m    kubelet     invalid capacity 0 on image filesystem
Normal   Starting                 49m    kube-proxy  Starting kube-proxy.
Warning  InvalidDiskCapacity      47m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  47m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   Starting                 47m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientMemory  46m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasSufficientPID     46m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeHasNoDiskPressure    46m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeAllocatableEnforced  45m    kubelet     Updated Node Allocatable limit across pods
Normal   NodeHasSufficientMemory  44m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    44m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     44m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Warning  InvalidDiskCapacity      44m    kubelet     invalid capacity 0 on image filesystem
Normal   Starting                 43m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientPID     42m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Warning  InvalidDiskCapacity      42m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  42m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    42m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeAllocatableEnforced  42m    kubelet     Updated Node Allocatable limit across pods
Normal   NodeHasSufficientMemory  40m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    40m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     40m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Warning  InvalidDiskCapacity      38m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  38m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasSufficientPID     38m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeHasNoDiskPressure    38m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeAllocatableEnforced  38m    kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 38m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientMemory  35m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    35m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     35m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeAllocatableEnforced  35m    kubelet     Updated Node Allocatable limit across pods
Normal   NodeHasSufficientPID     32m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeHasSufficientMemory  32m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    32m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeAllocatableEnforced  32m    kubelet     Updated Node Allocatable limit across pods
Warning  InvalidDiskCapacity      31m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  31m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    31m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     31m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeAllocatableEnforced  30m    kubelet     Updated Node Allocatable limit across pods
Normal   NodeHasSufficientPID     29m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeHasNoDiskPressure    29m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientMemory  29m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeAllocatableEnforced  28m    kubelet     Updated Node Allocatable limit across pods
Normal   NodeHasSufficientMemory  27m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    27m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     27m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeAllocatableEnforced  26m    kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 26m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientPID     25m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeHasSufficientMemory  25m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    25m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeAllocatableEnforced  25m    kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 24m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientMemory  23m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    23m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     23m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeAllocatableEnforced  23m    kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 22m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientPID     21m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Warning  InvalidDiskCapacity      21m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasNoDiskPressure    21m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientMemory  21m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   Starting                 20m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientMemory  19m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    19m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     19m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   Starting                 19m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientMemory  17m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    17m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     17m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeAllocatableEnforced  17m    kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 17m    kube-proxy  Starting kube-proxy.
Warning  InvalidDiskCapacity      16m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasNoDiskPressure    16m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientMemory  16m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasSufficientPID     16m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeAllocatableEnforced  15m    kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 15m    kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientMemory  13m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    13m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     13m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Warning  InvalidDiskCapacity      13m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeAllocatableEnforced  13m    kubelet     Updated Node Allocatable limit across pods
Normal   NodeHasNoDiskPressure    11m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     11m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeHasSufficientMemory  11m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeAllocatableEnforced  11m    kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 11m    kube-proxy  Starting kube-proxy.
Warning  InvalidDiskCapacity      10m    kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  10m    kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    10m    kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     10m    kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeAllocatableEnforced  9m56s  kubelet     Updated Node Allocatable limit across pods
Normal   NodeHasSufficientMemory  8m11s  kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    8m11s  kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   Starting                 7m37s  kube-proxy  Starting kube-proxy.
Normal   NodeHasSufficientPID     6m27s  kubelet     Node cubie0 status is now: NodeHasSufficientPID
Warning  InvalidDiskCapacity      6m27s  kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  6m27s  kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    6m27s  kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeAllocatableEnforced  6m21s  kubelet     Updated Node Allocatable limit across pods
Normal   Starting                 5m51s  kube-proxy  Starting kube-proxy.
Warning  InvalidDiskCapacity      4m32s  kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  4m32s  kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    4m32s  kubelet     Node cubie0 status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     4m32s  kubelet     Node cubie0 status is now: NodeHasSufficientPID
Normal   NodeAllocatableEnforced  4m27s  kubelet     Updated Node Allocatable limit across pods
Warning  InvalidDiskCapacity      2m33s  kubelet     invalid capacity 0 on image filesystem
Normal   NodeHasSufficientMemory  2m33s  kubelet     Node cubie0 status is now: NodeHasSufficientMemory
Normal   NodeHasNoDiskPressure    2m33s  kubelet     Node cubie0 status is now: NodeHasNoDiskPressure

rancher-max commented 3 years ago

@brandond I'm moving this back to "Working" in case there's anything we find that may help, and as time permits I'll see if there's anything else I can do to reproduce this. My thought is to find a way to set the healthz timeout short enough that these checks can't finish in time and therefore fail. If I had to guess, I'm going to try --shutdown-delay-duration as an apiserver arg and gradually increase it from 1 second just to see what happens.
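
For anyone trying this with k3s, the pass-through would look something like this (the 1-second value is just the starting point mentioned above):

k3s server --kube-apiserver-arg=shutdown-delay-duration=1s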

brandond commented 3 years ago

@rancher-max let's keep this in To Verify until we have a solid way to reproduce this that doesn't involve running it on excessively resource-constrained platforms that probably won't work anyway.

1998729 commented 3 years ago

I seem to have solved this problem; the memory and CPU limits set for the kubelet were too small.

nirui commented 3 years ago

@1998729 Can you share a little bit more detail? I'm interested to test it at my end. Thanks!

1998729 commented 2 years ago

@1998729 Can you share a little bit more detail? I'm interested to test it at my end. Thanks!

In the kubelet's systemd start params, I disabled the CPU and memory limits.
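
Presumably something along these lines, if done through k3s (a sketch; the zeroed reservations are my reading of "disabled", not confirmed flags):

ExecStart=/usr/local/bin/k3s \
    server \
        '--kubelet-arg=system-reserved=cpu=0,memory=0' \
        '--kubelet-arg=kube-reserved=cpu=0,memory=0'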

Dsafe1 commented 2 years ago

I just wanted to report my case. I have had a resource-constrained node (2 GB RAM, 1.6 GHz N2600) on which this problem would appear very often, but sometimes fix itself. After upgrading to 1.21 it works without problems.

nirui commented 2 years ago

After upgrading to 1.21 it works without problems.

That sounds great. Unfortunately I don't have the devices to test it anymore; my old boards... let's just say some of them no longer have RAM chips on them (nor a few of the solder pads for those chips).

Can we just assume this problem is solved until a new issue arises?

Thanks!