rancher / k3os

Purpose-built OS for Kubernetes, fully managed by Kubernetes.
https://k3os.io
Apache License 2.0

Not possible to create new pods after running for a while #531

Open liyimeng opened 4 years ago

liyimeng commented 4 years ago

Version (k3OS / kernel)

k3os version v0.11.0-m2-1

5.4.0-37-generic #41 SMP Mon Jun 22 16:24:04 UTC 2020

Architecture

amd64

Describe the bug

This was actually reported in https://github.com/rancher/k3s/issues/1962. Since that report shows old pods cannot be removed because of "Device is busy", it might be a k3os issue, so I am raising it here for attention.

To Reproduce

Expected behavior

Actual behavior

Additional context

dweomer commented 4 years ago

@liyimeng I have only seen similar issues with overlay installations (on all architectures). Is yours such?

liyimeng commented 4 years ago

@dweomer I have k3os installed to a dedicated disk. However, I have /var/lib/rancher/k3s/server on another disk, which I use to persist cluster data in case I have to re-install, and I think it should not cause this problem. Do you agree?

liyimeng commented 4 years ago

I happen to have a node that has this problem right now. However, on this node I have downgraded k3s to v1.17.5+k3s1.

uname -a
Linux rndmaster 5.4.0-29-generic #33 SMP Thu May 7 12:41:21 UTC 2020 x86_64 GNU/Linux

sudo mount

.... (cut off many lines)

shm on /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/a91f56cdac6174896c93c0ef7559de668c0177841832d1a7db4015223b06e955/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/d14de9925c179a3feb906004dafb0e02d291cf9762035cc65cd610aef66988eb/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/a91f56cdac6174896c93c0ef7559de668c0177841832d1a7db4015223b06e955/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/d14de9925c179a3feb906004dafb0e02d291cf9762035cc65cd610aef66988eb/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/a91f56cdac6174896c93c0ef7559de668c0177841832d1a7db4015223b06e955/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/d14de9925c179a3feb906004dafb0e02d291cf9762035cc65cd610aef66988eb/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/a91f56cdac6174896c93c0ef7559de668c0177841832d1a7db4015223b06e955/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)
shm on /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/d14de9925c179a3feb906004dafb0e02d291cf9762035cc65cd610aef66988eb/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k)

... cut off many lines 

Counting the total mounts:

sudo mount | wc -l
65637
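The total alone does not show which mount points are repeated. A minimal sketch for grouping the entries by mount point, assuming a Linux node where /proc/self/mounts is readable (its second field is the mount point). The function name count_mounts_by_target is hypothetical and not part of k3os or k3s.

```shell
#!/bin/sh
# Count how often each mount point appears, most-duplicated first.
# Expects lines in /proc/self/mounts format on stdin:
#   device mountpoint fstype options dump pass
# count_mounts_by_target is a hypothetical helper name.
count_mounts_by_target() {
    awk '{ print $2 }' | sort | uniq -c | sort -rn
}

# On a live node, show the ten most-repeated mount points.
if [ -r /proc/self/mounts ]; then
    count_mounts_by_target < /proc/self/mounts | head -n 10
fi
```

On a healthy node most mount points would be expected to appear once; a sandbox shm path with a count in the thousands would point at the entries that keep being re-mounted.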

Output of df with inode columns (as produced by df -i):


Filesystem           Inodes  IUsed       IFree IUse% Mounted on
/dev/sda1          12737680 159104    12578576    2% /
/dev/loop1            10358  10358           0  100% /usr
none               16458170    522    16457648    1% /etc
tmpfs              16458170    940    16457230    1% /run
tmpfs              16458170      3    16458167    1% /tmp
dev                16445213    632    16444581    1% /dev
shm                16458170      1    16458169    1% /dev/shm
cgroup_root        16458170     17    16458153    1% /sys/fs/cgroup
/dev/loop2            21001  21001           0  100% /usr/src
/dev/sdb1 27718900127     87 27718900040    1% /var/lib/rancher/k3s/server
/dev/sdb2  27718900637    597 27718900040    1% /var/lib/rancher/k3s/storage
tmpfs              16458170      9    16458161    1% /var/lib/kubelet/pods/cf488e9f-fbde-4b0a-801d-d1d623ebe487/volumes/kubernetes.io~secret/metrics-server-token-nl69b
tmpfs              16458170      9    16458161    1% /var/lib/kubelet/pods/4d220be5-3e74-432c-ba3a-980be9e709e0/volumes/kubernetes.io~secret/coredns-token-7kvkh
tmpfs              16458170      9    16458161    1% /var/lib/kubelet/pods/91ffbef7-79fc-4bc2-a810-13a8872c4c03/volumes/kubernetes.io~secret/k3os-upgrade-token-94n2t
tmpfs              16458170      9    16458161    1% /var/lib/kubelet/pods/a5eee274-0866-4128-abfe-19931cef366d/volumes/kubernetes.io~secret/default-token-8fbkd
shm                16458170      1    16458169    1% /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/a54006b950dda1911566593fe5c841251c2d5dac2bfba921ee08d68530ff05a5/shm
shm                16458170      1    16458169    1% /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/6bc2c506214bef31bae30320c76066fdbd2c32c769a6302ff2377933b7b4df9d/shm
shm                16458170      1    16458169    1% /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/07548378f963a075f49c76421ac3d55ef87af814c62a090df481be0921ca5993/shm
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/a54006b950dda1911566593fe5c841251c2d5dac2bfba921ee08d68530ff05a5/rootfs
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/6bc2c506214bef31bae30320c76066fdbd2c32c769a6302ff2377933b7b4df9d/rootfs
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/07548378f963a075f49c76421ac3d55ef87af814c62a090df481be0921ca5993/rootfs
shm                16458170      1    16458169    1% /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/f1fa7cb8d06c846ec1cc6fa8fdb66441246e81e5a4528cc04aea84c2a72219f9/shm
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/f1fa7cb8d06c846ec1cc6fa8fdb66441246e81e5a4528cc04aea84c2a72219f9/rootfs
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/cc318fb81ff68a58a933fbd5f5ce00adecbe0e805baea17b205f1e9015439bdc/rootfs
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/ac9fcc8075a5afd1873e59c0b52e989995caa29e58e19ef08a84295fe70957cf/rootfs
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/6c501137f9c1bbfa46e8775011da579c36c35aee168b41fdd764c67a92d98668/rootfs
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/f7d141ea68c7b0651d34674ef0997767ad7d5c6dfe7176be79763fa45ab54d3f/rootfs
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/1b02ad7f33c93c0c19d39a07d4d14b976f940f9ddae61fe8391608f5430a9e0e/rootfs
tmpfs              16458170      9    16458161    1% /var/lib/kubelet/pods/c4e0d260-30be-41d8-9689-026170135cff/volumes/kubernetes.io~secret/default-token-xc4lp
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/d14de9925c179a3feb906004dafb0e02d291cf9762035cc65cd610aef66988eb/rootfs
tmpfs              16458170      9    16458161    1% /var/lib/kubelet/pods/817d033f-7a51-4a15-832d-7f732c4d5fcd/volumes/kubernetes.io~secret/default-token-xc4lp
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/a91f56cdac6174896c93c0ef7559de668c0177841832d1a7db4015223b06e955/rootfs
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/661c2eccdea755ffe585d16d1f52a8d3a62ac40bcedd586cb69d0d69d70e947a/rootfs
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/f14c4635c2c4483b1a3a24d261d8679120d83fb8861f7f12b693ebcea74007c7/rootfs
tmpfs              16458170      9    16458161    1% /var/lib/kubelet/pods/fceccb7b-de89-4e91-a556-481d679ebf84/volumes/kubernetes.io~secret/default-token-xc4lp
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/add61ef916e243be1d3f2973090b30631fbfede3bf87224a1700b083a4ae6d9c/rootfs
overlay            12737680 159104    12578576    2% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/a1b258df0c24cca6eec3ae71fc910f7545d050ea0d6f46a6942bec6d7328a696/rootfs

I.e. I have 65637 mounts on the system, which makes the filesystem unusable. I am wondering if I have the same issue as https://github.com/rancher/k3os/issues/498

And to me, the last line in this output looks very strange:

sudo mount | grep sda
/dev/sda1 on / type ext4 (rw,relatime)
/dev/sda1 on /boot type ext4 (rw,relatime)
/dev/sda1 on /k3os/system type ext4 (ro,relatime)
/dev/sda1 on /var/lib/kubelet/pods/a5eee274-0866-4128-abfe-19931cef366d/volume-subpaths/ui-config/ui/0 type ext4 (rw,relatime)

liyimeng commented 4 years ago

and I don't have many pods running:

/run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes]# ls
07548378f963a075f49c76421ac3d55ef87af814c62a090df481be0921ca5993  c4ae71484288dfe32db20d8e8b358842f37243715d7b46b4cb7c2cad54c28cc5
6bc2c506214bef31bae30320c76066fdbd2c32c769a6302ff2377933b7b4df9d  d14de9925c179a3feb906004dafb0e02d291cf9762035cc65cd610aef66988eb
a54006b950dda1911566593fe5c841251c2d5dac2bfba921ee08d68530ff05a5  d3ec208106a96753fa919e7d588b1996d211e3d357d23e9850bf2cb02803fecb
a91f56cdac6174896c93c0ef7559de668c0177841832d1a7db4015223b06e955  f1fa7cb8d06c846ec1cc6fa8fdb66441246e81e5a4528cc04aea84c2a72219f9
add61ef916e243be1d3f2973090b30631fbfede3bf87224a1700b083a4ae6d9c
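To put the two numbers side by side, the count of sandbox directories can be compared with the count of mount entries below that directory. This is only a sketch, assuming /proc/self/mounts is readable; count_mounts_under is a hypothetical helper, and the sandboxes path is the one from the listing above.

```shell
#!/bin/sh
# Count mount entries whose mount point lies below a given directory.
# Expects /proc/self/mounts-format lines on stdin (field 2 = mount point).
# count_mounts_under is a hypothetical helper name.
count_mounts_under() {
    prefix=$1
    awk -v p="$prefix/" 'index($2, p) == 1 { n++ } END { print n + 0 }'
}

# Path taken from the listing above.
sandboxes=/run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes
if [ -d "$sandboxes" ] && [ -r /proc/self/mounts ]; then
    echo "sandbox dirs: $(ls "$sandboxes" | wc -l)"
    echo "mounts below: $(count_mounts_under "$sandboxes" < /proc/self/mounts)"
fi
```

With nine sandbox directories, a mount count far above nine would suggest the same shm mounts are being duplicated rather than new pods being created.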

liyimeng commented 4 years ago

This seems very similar: https://forum.storj.io/t/weird-behavior-cannot-connect-to-the-docker-daemon-at-unix-var-run-docker-sock-is-the-docker-daemon-running/4854/4