Open imatespl opened 5 years ago
docker-runc -version
runc version 1.0.0-rc5+dev
commit: 69663f0bd4b60df09991c08812a60108003fa340
spec: 1.0.0
That looks like an XFS bug to me, and I would suggest reporting it to CentOS. It happens when we create a new mount namespace with unshare(CLONE_NEWNS).
I hit the same issue. The machine is my Kubernetes worker node, running Red Hat Enterprise Linux Server 7.5 (Maipo), 3.10.0-1062.el7.x86_64, docker://19.3.2. This issue can cause PLEG and the Kubelet to stop working.
runc version 1.0.0-rc8
commit: 425e105d5a03fabd737a126ad93d62a9eeede87f
spec: 1.0.1-dev
I've also run into this issue; similar to @Aisuko, it presented on my Kubernetes worker node, which also showed PLEG and Kubelet failures. Node is running RHEL 7.6, Docker 18.09.9.
Has anyone found the reason? I have the same issue with kubernetes 1.16.3, docker version 19.03.3, and containerd 1.2.10, nvidia 1.0.0-rc8+dev, docker-init 0.18.0
@strgrb I've run into this issue as well, it looks like it is fixed in newer kernel versions, and may be related to https://github.com/opencontainers/runc/issues/1725 and https://bugzilla.redhat.com/show_bug.cgi?id=1507149
What OS and OS version are you running?
@ddl-rolandsugars I use CentOS 7.6 and the kernel version is 3.10.0-957. I don't think my problem is related to #1725 because I can't see kernel messages like 'SLUB: Unable to allocate memory on node'. I set vm.lowmem_reserve_ratio="1 256 32" to reserve more memory for DMA, and I have not seen this error for several weeks. But I don't know whether this is a correct solution.
@strgrb What is the storage device you're using?
@ddl-rolandsugars An SSD for / and another SSD for /var on some machines
@strgrb my bad, I meant storage driver. If you run docker info it should tell you. I think you're probably using devicemapper?
Example output:
$ docker info
Client:
Debug Mode: false
Server:
Containers: 2
Running: 0
Paused: 0
Stopped: 2
Images: 5
Server Version: 19.03.13
Storage Driver: overlay2 <= this.
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 8fba4e9a7d01810a393d5d25a3621dc101981175
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 4.19.76-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 1.944GiB
Name: docker-desktop
ID: RMQE:67ZV:WKCO:PNIS:FD2M:ON2P:HVYC:DSLI:5S7R:NEBG:RVDX:XTG7
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: gateway.docker.internal:3128
HTTPS Proxy: gateway.docker.internal:3129
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
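For anyone who wants to pull just that field out of output like the above, here is a minimal Python sketch. It assumes the plain-text `docker info` layout shown in the example, where the field appears as `Storage Driver: <name>`; the function names `storage_driver` and `current_storage_driver` are made up for illustration.

```python
import subprocess
from typing import Optional


def storage_driver(info_text: str) -> Optional[str]:
    """Return the value of the 'Storage Driver:' field, or None if absent."""
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "Storage Driver":
            # Keep only the first token so trailing notes are ignored.
            tokens = value.split()
            return tokens[0] if tokens else None
    return None


def current_storage_driver() -> Optional[str]:
    """Run `docker info` and parse it.

    Assumes the docker CLI is on PATH and the daemon is reachable.
    """
    result = subprocess.run(
        ["docker", "info"], capture_output=True, text=True, check=False
    )
    return storage_driver(result.stdout)
```

`docker info --format '{{.Driver}}'` should print the same value directly, which is simpler if you only need it on one node.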
@ddl-rolandsugars My storage driver is overlay2
Update the kernel to 3.10.0-1062.el7.x86_64 and disable kmem accounting by adding cgroup.memory=nokmem to the boot cmdline. Also see https://access.redhat.com/solutions/532663
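After rebooting, it may be worth confirming that the parameter actually reached the running kernel. A small Python sketch, assuming the standard `/proc/cmdline` file; the helper names `has_kernel_param` and `read_cmdline` are hypothetical:

```python
def has_kernel_param(cmdline: str, param: str) -> bool:
    """True if `param` appears as a whole token on the kernel command line."""
    return param in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Return the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


def kmem_accounting_disabled() -> bool:
    """Check the running kernel for cgroup.memory=nokmem."""
    return has_kernel_param(read_cmdline(), "cgroup.memory=nokmem")
```

This only checks that the option was passed at boot; it does not verify that the kernel in use honours it.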
docker-runc init failed, and the console prints this in a loop:
XFS: runc:1:CHILD possible memory allocation deadlock in kmem_zone_alloc (mode:0x82d0)
cat /proc/3580/stack
[] congestion_wait+0x82/0x110
[] kmem_zone_alloc+0x8c/0x130 [xfs]
[] xfs_trans_alloc+0x6d/0x140 [xfs]
[] xfs_inactive_ifree+0x55/0x230 [xfs]
[] xfs_inactive+0x8b/0x130 [xfs]
[] xfs_fs_destroy_inode+0x95/0x190 [xfs]
[] destroy_inode+0x3b/0x60
[] evict+0x115/0x180
[] iput+0xfc/0x190
[] __dentry_kill+0x120/0x180
[] dput+0xb0/0x160
[] drop_mountpoint+0x16/0x30
[] pin_kill+0x7d/0x100
[] group_pin_kill+0x21/0x30
[] namespace_unlock+0x71/0x80
[] drop_collected_mounts+0x54/0x60
[] put_mnt_ns+0x24/0x30
[] create_new_namespaces+0x165/0x180
[] unshare_nsproxy_namespaces+0x5a/0xc0
[] SyS_unshare+0x173/0x2e0
[] system_call_fastpath+0x22/0x27
[] 0xffffffffffffffff
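To find which PIDs are worth running `cat /proc/<pid>/stack` on, one option is to scan `/proc` for tasks in uninterruptible sleep (state `D`). The following is a rough Python sketch rather than a polished tool; the function names are hypothetical, and it relies only on the documented `/proc/<pid>/stat` layout of `pid (comm) state ...`.

```python
import os
from typing import List, Tuple


def parse_stat(stat_line: str) -> Tuple[int, str, str]:
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    left, _, right = stat_line.rpartition(")")
    pid_str, _, comm = left.partition("(")
    return int(pid_str), comm, right.split()[0]


def uninterruptible_tasks(proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """List (pid, comm) for every process currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

Reading `/proc/<pid>/stack` for the PIDs this returns usually requires root. A task that shows up in state `D` on every scan, like PID 3580 here, is the one to look at.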
Memory usage is low:
Tasks: 240 total, 1 running, 239 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.1 us, 0.8 sy, 0.0 ni, 73.5 id, 24.5 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 32780772 total, 12619844 free, 16482460 used, 3678468 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 10126064 avail Mem
ps -aux --forest
root 3558 0.0 0.0 7488 2804 ? Sl Apr02 0:21 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemo
root 3572 0.0 0.0 138832 7832 ? Sl Apr02 0:00 _ docker-runc --root /var/run/docker/runtime-runc/moby --log /run/docker/conta
root 3579 0.0 0.0 18388 4348 ? S Apr02 0:00 _ docker-runc init
root 3580 1.3 0.0 18388 2384 ? D Apr02 197:00 _ docker-runc init
System: 3.10.0-957.5.1.el7.x86_64 #1 SMP Fri Feb 1 14:54:57 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux