Have you tested to see if this behavior is unique to our packaging of containerd? Can you reproduce the same behavior with upstream containerd 1.4 when using cgroupv2?
Environmental Info:
K3s Version: v1.21.6+k3s1 (df033fa2), go version go1.16.8
Node(s) CPU architecture, OS, and Version: Linux hostname 5.11.0-1017-aws #18~20.04.1-Ubuntu SMP Fri Aug 27 11:21:54 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration: K3s v1.21.6+k3s1 cluster with 3 servers and 5 agents; all servers are using Linux cgroups v2.
Describe the bug:
When a process inside a pod is killed due to OOM, containerd doesn't report OOM events. This affects only systems using cgroups v2; with cgroups v1 it works as expected.
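Only nodes on the cgroup v2 unified hierarchy appear to be affected. As a quick, illustrative way to check which mode a node is running (assuming the standard cgroup mount at `/sys/fs/cgroup`):

```python
# Sketch: detect whether this node uses the cgroup v2 unified hierarchy
# (the affected configuration) or the legacy cgroup v1 layout.
import os

if os.path.exists("/sys/fs/cgroup/cgroup.controllers"):
    print("cgroup v2 (unified hierarchy)")
else:
    print("cgroup v1 (legacy hierarchy)")
```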
Steps To Reproduce:
The k3s-agent.service unit in use:

```
[Install]
WantedBy=multi-user.target

[Service]
Type=exec
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s-agent.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
    agent \
    '-c' \
    '/etc/rancher/k3s/config.yaml' \
    '--server' \
    'https://master:6443' \
```
In the `stress-oom-crasher` pod, execute Python code to cause an OOM event (a minimal example is sketched below).
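Any Python loop that keeps allocating and touching new memory will trigger the OOM killer once the pod's memory limit is reached; a minimal sketch (illustrative, not the exact code used):

```python
# Illustrative OOM trigger (not the original code from the report): keep
# appending ~100 MiB chunks until the container's memory limit is exceeded
# and the kernel OOM killer terminates the process.
chunks = []
while True:
    chunks.append(bytearray(100 * 1024 * 1024))
```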
Expected behavior:
On the node where the `stress-oom-crasher` pod is running, `ctr events` should show OOM (`/tasks/oom`) events.

Actual behavior:
There are no OOM events in the output of the `ctr events` command.

Additional context / logs:
I noticed that if I run a container manually, e.g.
```
ctr run -t --memory-limit=126000000 docker.io/library/python:3.9.9 test_oom bash
```
and generate an OOM, then the expected `/tasks/oom` event is shown in the output of `ctr events`. In that case the corresponding cgroup is created under `/sys/fs/cgroup/k8s.io/`, whereas when a container is created by k3s the corresponding cgroup is created under `/sys/fs/cgroup/kubepods/`.
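Regardless of where the cgroup lives, under cgroups v2 the kernel records OOM kills in each cgroup's `memory.events` file (the `oom_kill` counter), so the kill can be confirmed on the node even though `ctr events` stays silent. A rough sketch that scans the k3s-created `kubepods` hierarchy mentioned above (exact per-pod paths vary):

```python
# Sketch: walk the kubepods cgroup v2 tree created by k3s and report any
# non-zero oom_kill counters found in memory.events.
import pathlib

for events_file in pathlib.Path("/sys/fs/cgroup/kubepods").rglob("memory.events"):
    counters = dict(line.split() for line in events_file.read_text().splitlines())
    if int(counters.get("oom_kill", 0)) > 0:
        print(events_file.parent, "oom_kill =", counters["oom_kill"])
```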