docker / for-linux

Docker Engine for Linux
https://docs.docker.com/engine/installation/

Running container fails with failed to add the host cannot allocate memory #1443

Open hpakniamina opened 2 years ago

hpakniamina commented 2 years ago

OS: Red Hat Enterprise Linux release 8.7 (Ootpa). Versions:

$ sudo yum list installed | grep docker
containerd.io.x86_64                         1.6.9-3.1.el8                               @docker-ce-stable
docker-ce.x86_64                             3:20.10.21-3.el8                            @docker-ce-stable
docker-ce-cli.x86_64                         1:20.10.21-3.el8                            @docker-ce-stable
docker-ce-rootless-extras.x86_64             20.10.21-3.el8                              @docker-ce-stable
docker-scan-plugin.x86_64                    0.21.0-3.el8                                @docker-ce-stable

Out of hundreds of docker calls made over several days, a few fail. This is the shape of the command line:

/usr/bin/docker run \
-u 1771:1771 \
-a stdout \
-a stderr \
-v /my_path:/data \
--rm \
my_image:latest my_entry --my_args

The failure:

docker: Error response from daemon: failed to create endpoint recursing_aryabhata on network bridge: failed to add the host (veth6ad97f8) <=> sandbox (veth23b66ce) pair interfaces: cannot allocate memory.

It is not easily reproducible; the failure rate is less than one percent. At the time the error happens, the system has plenty of free memory. Around the time of a failure, the application is making around 5 docker calls per second, and each call takes about 5 to 10 seconds to complete.

petergerten commented 1 year ago

I have the same issue on Arch, also not consistently reproducible. Docker version 20.10.23, build 715524332f

hpakniamina commented 1 year ago

I have the same issue on Arch, also not consistently reproducible. Docker version 20.10.23, build 715524332f

I did not need the container's networking features, so passing "--network none" on the docker run command line circumvented the problem:

docker run ... --network none ...
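
Applied to the command schema from the original report, the workaround looks like this (a sketch reusing the placeholder names from above):

/usr/bin/docker run \
-u 1771:1771 \
-a stdout \
-a stderr \
-v /my_path:/data \
--network none \
--rm \
my_image:latest my_entry --my_args
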
henryborchers commented 1 year ago

It's happening to me when I build my images. Sadly, it too cannot be reproduced consistently.

docker build ...

nixon89 commented 1 year ago

I have the same behavior with docker build command (cannot allocate memory)

# docker version

Client: Docker Engine - Community
 Version:           23.0.0
 API version:       1.42
 Go version:        go1.19.5
 Git commit:        e92dd87
 Built:             Wed Feb  1 17:47:51 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.0
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.5
  Git commit:       d7573ab
  Built:            Wed Feb  1 17:47:51 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.16
  GitCommit:        31aa4358a36870b21a992d3ad2bef29e1d693bec
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
# apt list --installed | grep docker

docker-buildx-plugin/jammy,now 0.10.2-1~ubuntu.22.04~jammy amd64 [installed,automatic]
docker-ce-cli/jammy,now 5:23.0.0-1~ubuntu.22.04~jammy amd64 [installed]
docker-ce-rootless-extras/jammy,now 5:23.0.0-1~ubuntu.22.04~jammy amd64 [installed,automatic]
docker-ce/jammy,now 5:23.0.0-1~ubuntu.22.04~jammy amd64 [installed]
docker-compose-plugin/jammy,now 2.15.1-1~ubuntu.22.04~jammy amd64 [installed,automatic]
docker-scan-plugin/jammy,now 0.23.0~ubuntu-jammy amd64 [installed,automatic]
hostalp commented 1 year ago

Exactly the same issue here during docker build, Rocky Linux 8.7 (RHEL 8.7 clone), Docker 20.10.22-3.el8

b-khouy commented 1 year ago

I fixed the problem by running the docker builder prune command and then running the build again: https://docs.docker.com/engine/reference/commandline/builder_prune

hpakniamina commented 1 year ago

I fixed the problem by running the docker builder prune command and then running the build again: https://docs.docker.com/engine/reference/commandline/builder_prune

If one is dealing with an intermittent problem, then there is no guarantee the issue is resolved.

bendem commented 1 year ago

Same problem here: every so often, a build fails with failed to add the host ( ) <=> sandbox ( ) pair interfaces: cannot allocate memory. System info:


$ dnf list --installed docker\* containerd\* | cat
Installed Packages
containerd.io.x86_64                    1.6.20-3.1.el8         @docker-ce-stable
docker-buildx-plugin.x86_64             0.10.4-1.el8           @docker-ce-stable
docker-ce.x86_64                        3:23.0.2-1.el8         @docker-ce-stable
docker-ce-cli.x86_64                    1:23.0.2-1.el8         @docker-ce-stable
docker-ce-rootless-extras.x86_64        23.0.2-1.el8           @docker-ce-stable
docker-compose-plugin.x86_64            2.17.2-1.el8           @docker-ce-stable
docker-scan-plugin.x86_64               0.23.0-3.el8           @docker-ce-stable

$ sudo docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.10.4
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.17.2
    Path:     /usr/libexec/docker/cli-plugins/docker-compose
  scan: Docker Scan (Docker Inc.)
    Version:  v0.23.0
    Path:     /usr/libexec/docker/cli-plugins/docker-scan

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 55
 Server Version: 23.0.2
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2806fc1057397dbaeefbea0e4e17bddfbd388f38
 runc version: v1.1.5-0-gf19387a
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 4.18.0-425.13.1.el8_7.x86_64
 Operating System: Rocky Linux 8.7 (Green Obsidian)
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 15.4GiB
 Name: x
 ID: NUAJ:VDZR:RMDC:ASCP:5SEG:D4EF:OEIW:RY57:VXYI:5EZV:6F4F:D5RO
 Docker Root Dir: /opt/docker_data
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://x/
 Live Restore Enabled: false
 Default Address Pools:
   Base: 172.17.0.0/16, Size: 24
   Base: 172.20.0.0/16, Size: 24
   Base: 172.30.0.0/16, Size: 24
bendem commented 1 year ago

If I understand correctly, this is the same as https://bbs.archlinux.org/viewtopic.php?id=282429, which is fixed by this patch queued here.

henryborchers commented 1 year ago

I don't know if this helps but it's happening to me on Rocky Linux 8.7 as well, just like @hostalp.

pschoen-itsc commented 1 year ago

We have had the same issue on Ubuntu 20.04 for a few weeks.

thaJeztah commented 1 year ago

/cc @akerouanton FYI (I see a potential kernel issue mentioned above)

pschoen-itsc commented 1 year ago

We have the problem with an older kernel (5.15), so I do not think that there is a connection with the mentioned kernel bug.

XuNiLuS commented 1 year ago

I have the same problem with Debian 12 (6.1.0-9-amd64), but no problem with Debian 11 (5.10.0-21-amd64).

utrotzek commented 1 year ago

Same problem on Ubuntu 22.04.

Linux 5.15.0-76-generic #83-Ubuntu SMP Thu Jun 15 19:16:32 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

This error is really annoying since all of our CI/CD pipelines fail randomly.

skast96 commented 1 year ago

Same problem on Ubuntu 22.04 on a server

WolleTD commented 1 year ago

I have the same problem with Debian 12 (6.1.0-9-amd64), but no problem with Debian 11 (5.10.0-21-amd64).

Same here. I moved a server to Debian 12 that starts a new container once per day to back up docker volumes. After some days or weeks, starting the container fails until I restart the docker daemon.

skast96 commented 1 year ago

I compared three of my servers with the failing one, and the only difference is that vm.swappiness is set to 0 and the server has no swap activated at all, in case that helps.

pschoen-itsc commented 1 year ago

I compared three of my servers with the failing one, and the only difference is that vm.swappiness is set to 0 and the server has no swap activated at all, in case that helps.

We disabled swap on our failing servers, but it did not help.

skast96 commented 1 year ago

I compared three of my servers with the failing one, and the only difference is that vm.swappiness is set to 0 and the server has no swap activated at all, in case that helps.

We disabled swap on our failing servers, but it did not help.

I was thinking the other way around and wanted to check whether enabling swap helps 🤔

utrotzek commented 1 year ago

I compared three of my servers with the failing one, and the only difference is that vm.swappiness is set to 0 and the server has no swap activated at all, in case that helps.

That was EXACTLY the case on our servers.

Changing the value in /etc/sysctl.conf from

vm.swappiness = 0

to

vm.swappiness = 60

and applying it with sysctl -p solved it for us. You saved my life! ;) I forgot that I had set this value.
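
For reference, the same change can be made without hand-editing /etc/sysctl.conf (a sketch; run as root, and note that 60 is the kernel's default swappiness):

sysctl vm.swappiness                                          # check the current value
echo 'vm.swappiness = 60' > /etc/sysctl.d/99-swappiness.conf  # persist it (file name is illustrative)
sysctl --system                                               # reload all sysctl configuration files
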

skast96 commented 1 year ago

Did anyone fix this error without enabling swap? On that specific server it is not possible to enable swap....

pschoen-itsc commented 1 year ago

Configured swap as suggested here (disabled image pruning before each run) and the error appeared again after a few days.

skast96 commented 1 year ago

I am still trying to fix this on Ubuntu 22.04 without swap. My next guess is that I misconfigured something in my compose files, which leads to a high number of network connections left open. I am not sure if fixing that solves the problem or if it really is due to a kernel error. I will report my findings here next week. If anyone has figured it out, please feel free to comment.

hpakniamina commented 1 year ago

My next guess is that I misconfigured something in my compose files

As mentioned before, we did not need the networking, so "--network none" helped us work around it. We don't use docker compose. We simply call docker a couple of thousand times; the container reads the input, writes the output, and is removed by "--rm". Our issue does not have anything to do with unusual configurations or docker compose.

AmirL commented 1 year ago

Have the same problem. vm.swappiness = 60 helped for a while, but now the problem is back again.

cimchd commented 1 year ago

Same problem here. We have 89 services in our compose file. Running docker system prune before the build usually solves the problem temporarily.

bendem commented 1 year ago

The problem being seemingly random, I don't think changing vm.swappiness actually fixed the problem; it went away like it generally does and came back later. Some weeks we see this error every day, some weeks it doesn't happen at all.

vasugoriya commented 1 year ago

This can happen if your system is running low on memory, or if the Docker daemon is configured to use a large amount of memory.

Increase the amount of memory that the Docker daemon is allowed to use. You can do this by editing the dockerd configuration file. The default setting is 2 gigabytes.

Clean up Docker resources: Docker may have leftover resources that are not properly cleaned up. You can remove unused containers, images, and volumes to free up resources.
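
For completeness, the standard CLI commands for that kind of cleanup are shown below (a sketch; prune is destructive, so review what it will remove first):

docker system prune --volumes   # remove stopped containers, dangling images, unused networks, and unused volumes
docker builder prune            # remove dangling build cache
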

pschoen-itsc commented 1 year ago

This can happen if your system is running low on memory, or if the Docker daemon is configured to use a large amount of memory.

Increase the amount of memory that the Docker daemon is allowed to use. You can do this by editing the dockerd configuration file. The default setting is 2 gigabytes.

Clean up Docker resources: Docker may have leftover resources that are not properly cleaned up. You can remove unused containers, images, and volumes to free up resources.

I really don't think that this is the issue here. Our systems have more than enough resources and there is no limit set for dockerd.

jingzhaoyang commented 1 year ago

I'm having the same issue

dpkg-deb (subprocess): decompressing archive '/var/cache/apt/archives/strace_6.1-0.1_amd64.deb' (size=1313880) member 'control.tar': lzma error: Cannot allocate memory
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors
dpkg-deb: error: tar subprocess returned error exit status 2
dpkg: error processing archive /var/cache/apt/archives/strace_6.1-0.1_amd64.deb (--unpack):
 dpkg-deb --control subprocess returned error exit status 2
Errors were encountered while processing:
 /var/cache/apt/archives/libunwind8_1.6.2-3_amd64.deb
 /var/cache/apt/archives/strace_6.1-0.1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
E: Problem executing scripts DPkg::Post-Invoke 'rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true'
E: Sub-process returned an error code

However, this happens when running the latest Debian Bookworm image on an older version of Docker that does not fully support it. It was solved by modifying the configuration: adding --security-opt seccomp=unconfined to the run command.

bendem commented 1 year ago

I'm having the same issue

dpkg-deb (subprocess): decompressing archive '/var/cache/apt/archives/strace_6.1-0.1_amd64.deb' (size=1313880) member 'control.tar': lzma error: Cannot allocate memory
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors
dpkg-deb: error: tar subprocess returned error exit status 2
dpkg: error processing archive /var/cache/apt/archives/strace_6.1-0.1_amd64.deb (--unpack):
 dpkg-deb --control subprocess returned error exit status 2
Errors were encountered while processing:
 /var/cache/apt/archives/libunwind8_1.6.2-3_amd64.deb
 /var/cache/apt/archives/strace_6.1-0.1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
E: Problem executing scripts DPkg::Post-Invoke 'rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true'
E: Sub-process returned an error code

However, this happens when running the latest Debian Bookworm image on an older version of Docker that does not fully support it. It was solved by modifying the configuration: adding --security-opt seccomp=unconfined to the run command.

Your error is from a program inside the container (apparently due to seccomp, or a corrupted download, packaging, or memory issue); this issue is about an error in docker itself.

henryborchers commented 1 year ago

Is anybody on the docker team looking at this thread, or is this just an echo chamber?

pschoen-itsc commented 1 year ago

@henryborchers I'm not sure. For better visibility I also opened the issue here, but there is no activity there either, aside from some tags assigned by @thaJeztah, like in this issue.

skast96 commented 1 year ago

For reference: it has been a few weeks and I have been monitoring my "ill" server that has had this problem. It has swarm enabled, and I have been tweaking the network sections of my docker compose files. Each compose file has a standard network named "internal" to which only the services of that specific compose file are connected. These are deployed in different stacks, but the networks all had the same name. Docker never reported any error when deploying the stacks, and, my fault, I did not notice that there was one "internal" network shared by all compose files rather than one per file. I think docker had some trouble creating that network in the background, which led to it being recreated every time I deployed one of those stacks, even though a network named "internal" already existed. I noticed this a few weeks ago and gave each internal network a unique name. I also did not update docker for a few days, just to be sure any improvement came from the network renaming. Since then I haven't had the error.

I hope this helps someone.

pschoen-itsc commented 1 year ago

For reference: it has been a few weeks and I have been monitoring my "ill" server that has had this problem. It has swarm enabled, and I have been tweaking the network sections of my docker compose files. Each compose file has a standard network named "internal" to which only the services of that specific compose file are connected. These are deployed in different stacks, but the networks all had the same name. Docker never reported any error when deploying the stacks, and, my fault, I did not notice that there was one "internal" network shared by all compose files rather than one per file. I think docker had some trouble creating that network in the background, which led to it being recreated every time I deployed one of those stacks, even though a network named "internal" already existed. I noticed this a few weeks ago and gave each internal network a unique name. I also did not update docker for a few days, just to be sure any improvement came from the network renaming. Since then I haven't had the error.

I hope this helps someone.

Interesting finding. We do not specify any networks, but we also see the error during builds, so that shouldn't make a difference. It seems there are multiple sources for this error.

JonasAlfredsson commented 1 year ago

I am trying to debug this issue as well after we started seeing this on servers upgraded from Ubuntu 20 (5.4.0-105-generic) to Ubuntu 22 (5.15.0-86-generic). We have not noticed this error on the old machines, but suddenly we see it in multiple places where the common denominator is the newer OS.

The error is basically what has been mentioned before:

Error response from daemon: failed to create endpoint container.name on network bridge: failed to add the host (veth6bff3fa) <=> sandbox (vethed97567) pair interfaces: cannot allocate memory

and looking in the journal we see messages from the kernel like this:

Click for log ```txt Jun 07 11:29:42 hostname systemd[1]: Started libcontainer container 156ee6694c7463879ae9786bce73fdcada504b1f79eff9d4d521ded6f4ccc80c. Jun 07 11:29:42 hostname kernel: warn_alloc: 2 callbacks suppressed Jun 07 11:29:42 hostname kernel: dockerd: page allocation failure: order:5, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=docker.service,mems_allowed=0 Jun 07 11:29:42 hostname kernel: CPU: 8 PID: 272332 Comm: dockerd Not tainted 5.15.0-70-generic #77-Ubuntu Jun 07 11:29:42 hostname kernel: Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.18227214.B64.2106252220 06/25/2021 Jun 07 11:29:42 hostname kernel: Call Trace: Jun 07 11:29:42 hostname kernel: Jun 07 11:29:42 hostname kernel: show_stack+0x52/0x5c Jun 07 11:29:42 hostname kernel: dump_stack_lvl+0x4a/0x63 Jun 07 11:29:42 hostname kernel: dump_stack+0x10/0x16 Jun 07 11:29:42 hostname kernel: warn_alloc+0x138/0x160 Jun 07 11:29:42 hostname kernel: __alloc_pages_slowpath.constprop.0+0xa44/0xa80 Jun 07 11:29:42 hostname kernel: __alloc_pages+0x311/0x330 Jun 07 11:29:42 hostname kernel: alloc_pages+0x9e/0x1e0 Jun 07 11:29:42 hostname kernel: kmalloc_order+0x2f/0xd0 Jun 07 11:29:42 hostname kernel: kmalloc_order_trace+0x1d/0x90 Jun 07 11:29:42 hostname kernel: __kmalloc+0x2b1/0x330 Jun 07 11:29:42 hostname kernel: veth_alloc_queues+0x25/0x80 [veth] Jun 07 11:29:42 hostname kernel: veth_dev_init+0x72/0xd0 [veth] Jun 07 11:29:42 hostname kernel: register_netdevice+0x119/0x650 Jun 07 11:29:42 hostname kernel: veth_newlink+0x1a1/0x410 [veth] Jun 07 11:29:42 hostname kernel: __rtnl_newlink+0x772/0xa20 Jun 07 11:29:42 hostname kernel: ? __ext4_journal_get_write_access+0x8f/0x1b0 Jun 07 11:29:42 hostname kernel: ? __cond_resched+0x1a/0x50 Jun 07 11:29:42 hostname kernel: ? __getblk_gfp+0x2f/0xf0 Jun 07 11:29:42 hostname kernel: ? ext4_get_group_desc+0x68/0x100 Jun 07 11:29:42 hostname kernel: ? rmqueue_bulk+0x28b/0x770 Jun 07 11:29:42 hostname kernel: ? __nla_validate_parse+0x12f/0x1b0 Jun 07 11:29:42 hostname kernel: rtnl_newlink+0x49/0x70 Jun 07 11:29:42 hostname kernel: rtnetlink_rcv_msg+0x15a/0x400 Jun 07 11:29:42 hostname kernel: ? rtnl_calcit.isra.0+0x130/0x130 Jun 07 11:29:42 hostname kernel: netlink_rcv_skb+0x53/0x100 Jun 07 11:29:42 hostname kernel: rtnetlink_rcv+0x15/0x20 Jun 07 11:29:42 hostname kernel: netlink_unicast+0x220/0x340 Jun 07 11:29:42 hostname kernel: netlink_sendmsg+0x24b/0x4c0 Jun 07 11:29:42 hostname kernel: sock_sendmsg+0x66/0x70 Jun 07 11:29:42 hostname kernel: __sys_sendto+0x113/0x190 Jun 07 11:29:42 hostname kernel: ? move_addr_to_user+0x4d/0xe0 Jun 07 11:29:42 hostname kernel: ? fput+0x13/0x20 Jun 07 11:29:42 hostname kernel: __x64_sys_sendto+0x24/0x30 Jun 07 11:29:42 hostname kernel: do_syscall_64+0x59/0xc0 Jun 07 11:29:42 hostname kernel: ? exit_to_user_mode_prepare+0x37/0xb0 Jun 07 11:29:42 hostname kernel: ? syscall_exit_to_user_mode+0x27/0x50 Jun 07 11:29:42 hostname kernel: ? __x64_sys_recvfrom+0x24/0x30 Jun 07 11:29:42 hostname kernel: ? do_syscall_64+0x69/0xc0 Jun 07 11:29:42 hostname kernel: ? exit_to_user_mode_prepare+0x37/0xb0 Jun 07 11:29:42 hostname kernel: ? syscall_exit_to_user_mode+0x27/0x50 Jun 07 11:29:42 hostname kernel: ? __x64_sys_getsockname+0x19/0x30 Jun 07 11:29:42 hostname kernel: ? do_syscall_64+0x69/0xc0 Jun 07 11:29:42 hostname kernel: ? exit_to_user_mode_prepare+0x37/0xb0 Jun 07 11:29:42 hostname kernel: ? syscall_exit_to_user_mode+0x27/0x50 Jun 07 11:29:42 hostname kernel: ? 
__x64_sys_sendto+0x24/0x30 Jun 07 11:29:42 hostname kernel: ? do_syscall_64+0x69/0xc0 Jun 07 11:29:42 hostname kernel: ? irqentry_exit+0x1d/0x30 Jun 07 11:29:42 hostname kernel: ? exc_page_fault+0x89/0x170 Jun 07 11:29:42 hostname kernel: entry_SYSCALL_64_after_hwframe+0x61/0xcb Jun 07 11:29:42 hostname kernel: RIP: 0033:0x56230e25d36e Jun 07 11:29:42 hostname kernel: Code: 48 89 6c 24 38 48 8d 6c 24 38 e8 0d 00 00 00 48 8b 6c 24 38 48 83 c4 40 c3 cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48 Jun 07 11:29:42 hostname kernel: RSP: 002b:000000c0072a3630 EFLAGS: 00000202 ORIG_RAX: 000000000000002c Jun 07 11:29:42 hostname kernel: RAX: ffffffffffffffda RBX: 000000000000000c RCX: 000056230e25d36e Jun 07 11:29:42 hostname kernel: RDX: 0000000000000074 RSI: 000000c004746680 RDI: 000000000000000c Jun 07 11:29:42 hostname kernel: RBP: 000000c0072a3670 R08: 000000c000d81630 R09: 000000000000000c Jun 07 11:29:42 hostname kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 Jun 07 11:29:42 hostname kernel: R13: 0000000000000000 R14: 000000c003273a00 R15: ffffffffffffffff Jun 07 11:29:42 hostname kernel: Jun 07 11:29:42 hostname kernel: Mem-Info: Jun 07 11:29:42 hostname kernel: active_anon:364973 inactive_anon:500179 isolated_anon:0 active_file:1064559 inactive_file:1485684 isolated_file:0 unevictable:0 dirty:6200 writeback:0 slab_reclaimable:227333 slab_unreclaimable:144023 mapped:53109 shmem:2037 pagetables:13459 bounce:0 kernel_misc_reclaimable:0 free:77810 free_pcp:461 free_cma:0 Jun 07 11:29:42 hostname kernel: Node 0 active_anon:1459892kB inactive_anon:2000716kB active_file:4258236kB inactive_file:5942736kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:212436kB dirty:24800kB writeback:0kB shmem:8148kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 514048kB writeback_tmp:0kB kernel_stack:688> Jun 07 11:29:42 hostname kernel: Node 0 DMA free:13312kB min:60kB low:72kB high:84kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Jun 07 11:29:42 hostname kernel: lowmem_reserve[]: 0 2923 15876 15876 15876 Jun 07 11:29:42 hostname kernel: Node 0 DMA32 free:92748kB min:20624kB low:23732kB high:26840kB reserved_highatomic:2048KB active_anon:176236kB inactive_anon:485484kB active_file:761908kB inactive_file:1242652kB unevictable:0kB writepending:6404kB present:3128720kB managed:3063108kB mlocked:0kB bounce:0kB free_pcp:24kB local_pcp:0> Jun 07 11:29:42 hostname kernel: lowmem_reserve[]: 0 0 12952 12952 12952 Jun 07 11:29:42 hostname kernel: Node 0 Normal free:205180kB min:71468kB low:85236kB high:99004kB reserved_highatomic:94208KB active_anon:1283928kB inactive_anon:1514924kB active_file:3494808kB inactive_file:4701152kB unevictable:0kB writepending:17196kB present:13631488kB managed:13272040kB mlocked:0kB bounce:0kB free_pcp:1840kB > Jun 07 11:29:42 hostname kernel: lowmem_reserve[]: 0 0 0 0 0 Jun 07 11:29:42 hostname kernel: Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 13312kB Jun 07 11:29:42 hostname kernel: Node 0 DMA32: 3875*4kB (UMEH) 2041*8kB (UMEH) 1524*16kB (UMEH) 758*32kB (UMEH) 209*64kB (UMH) 4*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 94612kB Jun 07 11:29:42 hostname kernel: Node 0 Normal: 8576*4kB (UMEH) 3921*8kB (UMEH) 4568*16kB (UMEH) 
1558*32kB (UMEH) 124*64kB (UH) 49*128kB (H) 12*256kB (H) 1*512kB (H) 1*1024kB (H) 0*2048kB 0*4096kB = 207432kB Jun 07 11:29:42 hostname kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Jun 07 11:29:42 hostname kernel: 2556889 total pagecache pages Jun 07 11:29:42 hostname kernel: 4614 pages in swap cache Jun 07 11:29:42 hostname kernel: Swap cache stats: add 15592825, delete 15588440, find 127949791/132883924 Jun 07 11:29:42 hostname kernel: Free swap = 38208kB Jun 07 11:29:42 hostname kernel: Total swap = 524284kB Jun 07 11:29:42 hostname kernel: 4194051 pages RAM Jun 07 11:29:42 hostname kernel: 0 pages HighMem/MovableOnly Jun 07 11:29:42 hostname kernel: 106424 pages reserved Jun 07 11:29:42 hostname kernel: 0 pages hwpoisoned ```

From a quick glance at what is going on here, it appears the kernel is unable to allocate the requested 128 KiB of contiguous memory. We have >8 GB of RAM free, but apparently that does not mean much, since free memory can be fragmented into many smaller pieces, which makes this allocation fail.
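
One way to see that fragmentation is /proc/buddyinfo, which lists the number of free blocks per allocation order (a diagnostic sketch, not from the thread; assumes 4 KiB pages):

cat /proc/buddyinfo   # columns after the zone name are free block counts for orders 0..10

# An order:5 allocation needs one free 32-page (128 KiB) block, i.e. a non-zero
# count in the order-5 column or higher; sum those columns for the Normal zone:
awk '$4 == "Normal" { t = 0; for (i = 10; i <= NF; i++) t += $i; print "order >= 5 free blocks:", t }' /proc/buddyinfo
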

After some quick reading, I tested executing the following two commands (as the root user):

echo 3 > /proc/sys/vm/drop_caches
echo 1 > /proc/sys/vm/compact_memory

and this appears to at least help with the issue. However, on our high-workload server the relief only lasted a couple of minutes before the error returned, while the low-workload server has gone >24 hours without reporting the issue.

Throwing this into the conversation in the hope that it helps us get further in understanding what goes wrong.

pschoen-itsc commented 1 year ago

@JonasAlfredsson If this is the case, then there is a clear connection to the already linked Haskell issue, which refers to this kernel fix:

mas_empty_area() was incorrectly returning an error when there was room.

The maple tree was only introduced recently, so it makes sense that the problem did not occur on an older kernel.

We, for example, use 6.2.0-azure, which does not have the fix, while the most recent 6.1.57 has it. So you have to be very careful which version you use, because some distros take the position that you only need security fixes and that other bugs are not important...

WolleTD commented 1 year ago

I can also report that this doesn't appear to happen on an Arch Linux machine that starts a lot of containers for CI jobs (which I keep in my production environment for things like this), while the Debian Bookworm machine next to it experiences this all the time. That also points to that patch as addressing the underlying issue.

Bookworm is at 6.1.55 right now, so I guess I'll wait for the next kernel update and see if the problem disappears?

pschoen-itsc commented 1 year ago

We updated our Ubuntu 22.04 servers to kernel 6.5 (new from the Ubuntu 23.10 release). This kernel has the above-mentioned patch, and so far the error has not appeared again. I will report back if we see the error again.

JonasAlfredsson commented 1 year ago

Got a comment from a person more familiar with the kernel than me that the page allocation failure: order:5 seemed very large for the system we have, and a suggestion to explicitly set the nr_cpus kernel parameter at boot, since this apparently has an effect on how much memory the system allocates for this call.

We have a VM with 12 cores exposed to it (while the physical host has many more), so I did the following:

echo 'GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT nr_cpus=12"' > /etc/default/grub.d/cpu-limit.cfg  
update-grub
reboot
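
After the reboot, the parameter can be verified like this (a sketch; nr_cpus=12 matches the VM described above):

grep -o 'nr_cpus=[0-9]*' /proc/cmdline   # confirm the boot parameter is active
nproc                                    # should now report 12
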

and while the reboot itself definitely helped reset the system to a non-fragmented state, we have since seen 36 hours with no errors. I will also update this post in case we see the error on this host again.

pschoen-itsc commented 1 year ago

We haven't had any errors since updating our kernel to 6.5, so I think this resolves the issue.

CoyoteWAN commented 1 year ago

We updated our Ubuntu 22.04 servers to kernel 6.5 (new from the Ubuntu 23.10 release). This kernel has the above-mentioned patch, and so far the error has not appeared again. I will report back if we see the error again.

We just did the same a few weeks ago, but updating the kernel did not fix our problem. Was there anything else you did after the kernel update?

pschoen-itsc commented 1 year ago

We updated our Ubuntu 22.04 servers to kernel 6.5 (new from the Ubuntu 23.10 release). This kernel has the above-mentioned patch, and so far the error has not appeared again. I will report back if we see the error again.

We just did the same a few weeks ago, but updating the kernel did not fix our problem. Was there anything else you did after the kernel update?

Nothing I'm aware of. Before that we also tried various things suggested here (mostly playing around with swap), but they did not have a direct effect. Now swap is enabled and vm.swappiness = 60.

Kernel in use: Linux ****** 6.5.0-1004-oem #4-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 15 19:52:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

pschoen-itsc commented 1 year ago

@CoyoteWAN Today, one month later, we had the same error again.

PieterDeelen commented 1 year ago

We applied the nr_cpus workaround suggested by @JonasAlfredsson two weeks ago and have not seen the error since then. We were already running kernel 6.5 before, but that did not help in our case.

JonasAlfredsson commented 1 year ago

Just checked the collected logs from the two servers I experimented on:

Will apply the patch to the other servers now and see what happens.

acdha commented 1 year ago

We applied the workaround suggested by @JonasAlfredsson two weeks ago and have not seen the error since then. We were already running kernel 6.5 before, but that did not help in our case.

We have had similar results on RedHat 8.9 systems with this total kludge in cron:

*/5 * * * * sh -c 'echo 3 > /proc/sys/vm/drop_caches; echo 1 > /proc/sys/vm/compact_memory'

Update 2024-01-08: we still see sporadic failures, but only during peak activity, when a bunch of GitLab CI scheduled builds launch at the same time and keep every CI runner fully loaded. It's still infrequent then, so the workaround is clearly having some impact, but it is not a solution.

akerouanton commented 1 year ago

I'm wondering: has anyone in this thread tried to report this issue to the kernel devs or to their distro? This issue isn't actionable on our side until we're able to reproduce it, and AFAICT we're not, so there's little we can do at this point.

attie-argentum commented 10 months ago

I've been seeing this issue for some time too... I've tried tweaking swappiness and dropping caches, but it never helps for long, or only improves the chances of things working. The only thing I've found that resolves this is a full reboot, and then it's just a matter of time until it happens again... I think I've tried restarting the docker daemon (and all containers) before, but I don't remember, so I will give that a go this time.

My last boot was 2023-12-20, and it started occurring again on 2024-01-12... the probability appears to be zero for a while, then increases over time until docker is virtually unusable and I'm forced to reboot.

As for reproducing it: I don't think an idle system will show this; rather, a system that creates and destroys many containers (or quite possibly veth devices specifically) will probably work its way to this point over time.
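
A brute-force attempt along those lines might look like this (hypothetical; the thread has no confirmed reproducer, and the image is just an example):

# Churn containers so veth pairs are created and destroyed continuously,
# then check the kernel log for the allocation failures shown above.
while docker run --rm alpine:latest true; do :; done
dmesg | grep -i 'page allocation failure'
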

From a quick look at metrics recorded by Telegraf, nothing in particular stands out - though I did notice a large number of defunct / zombie processes, so we'll see if dealing with them gives my system a new lease of life... 🀞 (I'm not hopeful)

The output in dmesg I see is similar to @JonasAlfredsson's (here):

dmesg output ``` [2548650.485135] docker0: port 7(vethadd18b9) entered blocking state [2548650.485139] docker0: port 7(vethadd18b9) entered disabled state [2548650.485181] device vethadd18b9 entered promiscuous mode [2548650.549225] warn_alloc: 1 callbacks suppressed [2548650.549228] dockerd: page allocation failure: order:4, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=docker.service,mems_allowed=0 [2548650.549236] CPU: 8 PID: 3830513 Comm: dockerd Tainted: P O 5.15.0-91-generic #101-Ubuntu [2548650.549239] Hardware name: ASUS System Product Name/Pro WS W790E-SAGE SE, BIOS 0506 04/28/2023 [2548650.549240] Call Trace: [2548650.549241] [2548650.549244] show_stack+0x52/0x5c [2548650.549251] dump_stack_lvl+0x4a/0x63 [2548650.549257] dump_stack+0x10/0x16 [2548650.549259] warn_alloc+0x138/0x160 [2548650.549263] __alloc_pages_slowpath.constprop.0+0xa44/0xa80 [2548650.549265] __alloc_pages+0x311/0x330 [2548650.549267] alloc_pages+0x9e/0x1e0 [2548650.549270] kmalloc_order+0x2f/0xd0 [2548650.549273] kmalloc_order_trace+0x1d/0x90 [2548650.549275] __kmalloc+0x2b1/0x330 [2548650.549279] veth_alloc_queues+0x25/0x80 [veth] [2548650.549282] veth_dev_init+0x72/0xd0 [veth] [2548650.549284] register_netdevice+0x119/0x650 [2548650.549287] veth_newlink+0x258/0x440 [veth] [2548650.549290] __rtnl_newlink+0x77c/0xa50 [2548650.549293] ? __find_get_block+0xe0/0x240 [2548650.549297] ? __nla_validate_parse+0x12f/0x1b0 [2548650.549301] ? __nla_validate_parse+0x12f/0x1b0 [2548650.549303] ? netdev_name_node_lookup+0x36/0x80 [2548650.549306] ? __dev_get_by_name+0xe/0x20 [2548650.549309] rtnl_newlink+0x49/0x70 [2548650.549311] rtnetlink_rcv_msg+0x15a/0x400 [2548650.549313] ? rtnl_calcit.isra.0+0x130/0x130 [2548650.549315] netlink_rcv_skb+0x53/0x100 [2548650.549319] rtnetlink_rcv+0x15/0x20 [2548650.549322] netlink_unicast+0x220/0x340 [2548650.549323] netlink_sendmsg+0x24b/0x4c0 [2548650.549329] sock_sendmsg+0x66/0x70 [2548650.549332] __sys_sendto+0x113/0x190 [2548650.549334] __x64_sys_sendto+0x24/0x30 [2548650.549335] do_syscall_64+0x59/0xc0 [2548650.549338] ? syscall_exit_to_user_mode+0x35/0x50 [2548650.549339] ? __x64_sys_recvfrom+0x24/0x30 [2548650.549341] ? do_syscall_64+0x69/0xc0 [2548650.549342] ? do_syscall_64+0x69/0xc0 [2548650.549343] ? exit_to_user_mode_prepare+0x37/0xb0 [2548650.549346] ? syscall_exit_to_user_mode+0x35/0x50 [2548650.549348] ? do_syscall_64+0x69/0xc0 [2548650.549349] ? irqentry_exit+0x1d/0x30 [2548650.549350] ? 
exc_page_fault+0x89/0x170 [2548650.549352] entry_SYSCALL_64_after_hwframe+0x62/0xcc [2548650.549355] RIP: 0033:0x5633739e548e [2548650.549357] Code: 48 89 6c 24 38 48 8d 6c 24 38 e8 0d 00 00 00 48 8b 6c 24 38 48 83 c4 40 c3 cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48 [2548650.549359] RSP: 002b:000000c00a5c1908 EFLAGS: 00000202 ORIG_RAX: 000000000000002c [2548650.549361] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00005633739e548e [2548650.549362] RDX: 0000000000000074 RSI: 000000c0061d3180 RDI: 000000000000000c [2548650.549363] RBP: 000000c00a5c1948 R08: 000000c0017a2d00 R09: 000000000000000c [2548650.549363] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 [2548650.549364] R13: 0000000000000024 R14: 000000c00f6dc820 R15: 000000c000d01400 [2548650.549366] [2548650.549366] Mem-Info: [2548650.549369] active_anon:10079171 inactive_anon:12477227 isolated_anon:0 active_file:138230 inactive_file:66237 isolated_file:0 unevictable:16 dirty:72 writeback:911 slab_reclaimable:412944 slab_unreclaimable:5207631 mapped:183355 shmem:87571 pagetables:168675 bounce:0 kernel_misc_reclaimable:0 free:6451381 free_pcp:1506 free_cma:0 [2548650.549373] Node 0 active_anon:40316684kB inactive_anon:49908908kB active_file:552920kB inactive_file:264948kB unevictable:64kB isolated(anon):0kB isolated(file):0kB mapped:733420kB dirty:288kB writeback:3644kB shmem:350284kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 415744kB writeback_tmp:0kB kernel_stack:180064kB pagetables:674700kB all_unreclaimable? no [2548650.549377] Node 0 DMA free:11260kB min:0kB low:12kB high:24kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [2548650.549380] lowmem_reserve[]: 0 1359 257097 257097 257097 [2548650.549383] Node 0 DMA32 free:1022968kB min:356kB low:1744kB high:3132kB reserved_highatomic:0KB active_anon:23512kB inactive_anon:268248kB active_file:344kB inactive_file:0kB unevictable:0kB writepending:0kB present:1574044kB managed:1507828kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [2548650.549386] lowmem_reserve[]: 0 0 255737 255737 255737 [2548650.549388] Node 0 Normal free:24771296kB min:953664kB low:1215536kB high:1477408kB reserved_highatomic:0KB active_anon:40293172kB inactive_anon:49640660kB active_file:552576kB inactive_file:264948kB unevictable:64kB writepending:4428kB present:266338304kB managed:261884428kB mlocked:64kB bounce:0kB free_pcp:6500kB local_pcp:0kB free_cma:0kB [2548650.549393] lowmem_reserve[]: 0 0 0 0 0 [2548650.549395] Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (M) 2*4096kB (M) = 11260kB [2548650.549407] Node 0 DMA32: 10822*4kB (ME) 6396*8kB (ME) 3214*16kB (ME) 1269*32kB (ME) 442*64kB (UME) 266*128kB (UME) 196*256kB (UME) 100*512kB (UME) 85*1024kB (ME) 60*2048kB (ME) 113*4096kB (UME) = 1022968kB [2548650.549415] Node 0 Normal: 145980*4kB (U) 1274711*8kB (UE) 677842*16kB (UE) 98444*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 24777288kB [2548650.549421] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [2548650.549423] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [2548650.549423] 331547 total pagecache pages [2548650.549424] 40340 pages in swap cache 
[2548650.549425] Swap cache stats: add 3363822, delete 3323478, find 250191679/250345953 [2548650.549425] Free swap = 5336824kB [2548650.549426] Total swap = 16777208kB [2548650.549427] 66982085 pages RAM [2548650.549427] 0 pages HighMem/MovableOnly [2548650.549427] 1130181 pages reserved [2548650.549428] 0 pages hwpoisoned [2548650.834172] eth0: renamed from veth47414c9 [2548650.994237] IPv6: ADDRCONF(NETDEV_CHANGE): vethadd18b9: link becomes ready [2548650.994365] docker0: port 7(vethadd18b9) entered blocking state [2548650.994368] docker0: port 7(vethadd18b9) entered forwarding state ```
System Info Docker is not running inside a VM, but this system is running a number of KVM VMs alongside. ``` $ uname -a Linux kenai 5.15.0-91-generic #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux $ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 22.04.3 LTS Release: 22.04 Codename: jammy $ cat /proc/cpuinfo | sed -nre '/^model name/{p;q}' model name : Intel(R) Xeon(R) w7-3465X $ free -h total used free shared buff/cache available Mem: 251Gi 223Gi 24Gi 341Mi 2.7Gi 24Gi Swap: 15Gi 11Gi 4.5Gi $ dockerd --version Docker version 24.0.7, build 311b9ff $ systemctl status docker.service | head -n 11 ● docker.service - Docker Application Container Engine Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled) Active: active (running) since Wed 2023-12-20 13:05:20 GMT; 4 weeks 1 day ago TriggeredBy: ● docker.socket Docs: https://docs.docker.com Main PID: 11267 (dockerd) Tasks: 664 Memory: 416.4M CPU: 14h 31min 12.125s CGroup: /system.slice/docker.service β”œβ”€ 11267 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock ```