earlyoom - Early OOM Daemon for Linux
MIT License

earlyoom - The Early OOM Daemon

The oom-killer generally has a bad reputation among Linux users. This may be part of the reason Linux invokes it only when it has absolutely no other choice. It will swap out the desktop environment, drop the whole page cache and empty every buffer before it will ultimately kill a process. At least that's what I think it will do. I have yet to be patient enough to wait for it, sitting in front of an unresponsive system.

This made me and other people wonder if the oom-killer could be configured to step in earlier: reddit r/linux, superuser.com, unix.stackexchange.com.

As it turns out, no, it can't, at least not with the in-kernel oom-killer. In user space, however, we can do whatever we want.

earlyoom wants to be simple and solid. It is written in pure C with no dependencies. An extensive test suite (unit- and integration tests) is written in Go.

What does it do

earlyoom checks the amount of available memory and free swap up to 10 times a second (less often if there is a lot of free memory). By default, if both are below 10%, it will kill the largest process (highest oom_score). The percentage values are configurable via command line arguments.

In the free -m output below, the available memory is 2170 MiB and the free swap is 231 MiB.

              total        used        free      shared  buff/cache   available
Mem:           7842        4523         137         841        3182        2170
Swap:          1023         792         231

Why is "available" memory checked as opposed to "free" memory? On a healthy Linux system, "free" memory is supposed to be close to zero, because Linux uses all available physical memory to cache disk access. These caches can be dropped any time the memory is needed for something else.

The "available" memory accounts for that. It sums up all memory that is unused or can be freed immediately.

Note that you need a recent version of free and Linux kernel 3.14+ to see the "available" column. If you have a recent kernel, but an old version of free, you can get the value from grep MemAvailable /proc/meminfo.
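If you want the same value in a script rather than from grep, the following is a minimal sketch of parsing the /proc/meminfo layout. The function name parse_meminfo and the SAMPLE text are made up for illustration; the sample numbers are chosen to match the free -m output above.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to their numeric value.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


# Illustrative sample in the /proc/meminfo layout (not captured from a real system).
SAMPLE = """\
MemTotal:        8030208 kB
MemFree:          140288 kB
MemAvailable:    2222080 kB
SwapTotal:       1047552 kB
SwapFree:         236544 kB
"""

info = parse_meminfo(SAMPLE)
print("MemAvailable: %d MiB" % (info["MemAvailable"] // 1024))
print("SwapFree:     %d MiB" % (info["SwapFree"] // 1024))
```

On a live Linux system you would pass the contents of /proc/meminfo instead of SAMPLE. On kernels older than 3.14 the MemAvailable key will simply be missing.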

When your available memory drops below 10% of the total memory available to userspace processes (=total-shared) and your free swap drops below 10% of the total swap, earlyoom will send the SIGTERM signal to the process that uses the most memory in the opinion of the kernel (/proc/*/oom_score).
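As a worked example, here is a rough sketch of that decision applied to the free -m numbers above. It is not earlyoom's actual code (which is C). The function names are invented, the "<=" comparison follows the wording of the startup message shown under Use, and treating a system without swap as 0% free swap is an assumption of this sketch.

```python
def percent(part, total):
    """Percentage of total; a zero total (e.g. no swap) counts as 0% (assumption)."""
    return 100.0 * part / total if total > 0 else 0.0


def should_act(mem_avail, user_mem_total, swap_free, swap_total,
               mem_min_pct=10.0, swap_min_pct=10.0):
    """True only when BOTH available memory and free swap are at or below their minimum."""
    return (percent(mem_avail, user_mem_total) <= mem_min_pct
            and percent(swap_free, swap_total) <= swap_min_pct)


# Values in MiB, taken from the free -m output above.
total, shared, available = 7842, 841, 2170
swap_total, swap_free = 1023, 231
user_mem_total = total - shared  # "total-shared", as described in the text

print("mem avail: %.2f%%" % percent(available, user_mem_total))
print("swap free: %.2f%%" % percent(swap_free, swap_total))
print("act:", should_act(available, user_mem_total, swap_free, swap_total))
```

With these numbers both values are well above 10%, so nothing would be killed. Note that low memory alone is not enough: as long as there is plenty of free swap, the check stays false.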

See also

Why not trigger the kernel oom killer?

earlyoom does not use echo f > /proc/sysrq-trigger because:

In some kernel versions (tested on v4.0.5), triggering the kernel oom killer manually does not work at all. That is, it may only free some graphics memory (that will be allocated immediately again) and not actually kill any process. Here you can see what this looks like on my machine (Intel integrated graphics).

This problem has been fixed in Linux v5.17 (commit f530243a).

Like the Linux kernel would, earlyoom finds its victim by reading through /proc/*/oom_score.
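To illustrate the idea, here is a simplified sketch of such a scan. It only picks the highest oom_score and ignores everything else earlyoom takes into account (for example --prefer, --avoid and --ignore). The function name find_victim and the proc_root parameter are made up for this example.

```python
import os


def find_victim(proc_root="/proc"):
    """Return (pid, oom_score) of the process with the highest oom_score, or None."""
    best = None
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo" or "self"
        try:
            with open(os.path.join(proc_root, entry, "oom_score")) as f:
                score = int(f.read().strip())
        except (OSError, ValueError):
            continue  # process exited during the scan, or the file is unreadable
        if best is None or score > best[1]:
            best = (int(entry), score)
    return best


if os.path.isdir("/proc"):
    print("current candidate (pid, oom_score):", find_victim())
```

Processes can disappear between listing /proc and opening their oom_score file, which is why read errors are skipped rather than treated as fatal.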

How much memory does earlyoom use?

About 2 MiB (VmRSS), though only 220 KiB is private memory (RssAnon). The rest is the libc library (RssFile) that is shared with other processes. All memory is locked using mlockall() to make sure earlyoom does not slow down in low memory situations.
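These three fields come from /proc/<pid>/status. If you want to check them for a running instance, a small sketch like the following can extract them. The function name rss_fields is invented, and the SAMPLE text is illustrative only (its numbers are picked to roughly match the figures above, not measured).

```python
def rss_fields(status_text):
    """Extract VmRSS, RssAnon and RssFile (in kB) from /proc/<pid>/status text."""
    wanted = ("VmRSS", "RssAnon", "RssFile")
    result = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in wanted:
            result[name] = int(rest.split()[0])
    return result


# Illustrative excerpt in the /proc/<pid>/status layout (made-up numbers).
SAMPLE = """\
Name:\tearlyoom
VmRSS:\t    2048 kB
RssAnon:\t     220 kB
RssFile:\t    1828 kB
"""

for name, kb in rss_fields(SAMPLE).items():
    print("%-8s %5d kB" % (name, kb))
```

For a real process, pass the contents of /proc/<pid>/status instead of SAMPLE. RssAnon and RssFile are only reported by reasonably recent kernels.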

Download and compile

Compiling yourself is easy:

git clone https://github.com/rfjakob/earlyoom.git
cd earlyoom
make

Optional: Run the integrated self-tests:

make test

Start earlyoom automatically by registering it as a service:

sudo make install              # systemd
sudo make install-initscript   # non-systemd

Note that on systems with SELinux disabled (Ubuntu 19.04, Debian 9 ...), chcon warnings about failing to set the context can be safely ignored.

For Debian 10+ and Ubuntu 18.04+, there's a Debian package:

sudo apt install earlyoom

For Fedora and RHEL 8 with EPEL, there's a Fedora package:

sudo dnf install earlyoom
sudo systemctl enable --now earlyoom

For Arch Linux, there's an Arch Linux package:

sudo pacman -S earlyoom
sudo systemctl enable --now earlyoom

Availability in other distributions: see repology page.

Use

Start the executable you have just compiled:

./earlyoom

It will inform you how much memory and swap you have, what the minimum is, how much memory is available and how much swap is free.

./earlyoom
earlyoom v1.8
mem total: 23890 MiB, user mem total: 21701 MiB, swap total: 8191 MiB
sending SIGTERM when mem avail <= 10.00% and swap free <= 10.00%,
        SIGKILL when mem avail <=  5.00% and swap free <=  5.00%
mem avail: 20012 of 21701 MiB (92.22%), swap free: 5251 of 8191 MiB (64.11%)
mem avail: 20031 of 21721 MiB (92.22%), swap free: 5251 of 8191 MiB (64.11%)
mem avail: 20033 of 21723 MiB (92.22%), swap free: 5251 of 8191 MiB (64.11%)
[...]

If the values drop below the minimum, processes are killed until they are above the minimum again. Every action is logged to stderr. If you are running earlyoom as a systemd service, you can view the last 10 lines using:

systemctl status earlyoom

Testing

In order to see earlyoom in action, create/simulate a memory leak and let earlyoom do what it does:

tail /dev/zero

Checking Logs

If you need any further actions after a process is killed by earlyoom (such as sending emails), you can parse the logs with:

sudo journalctl -u earlyoom | grep sending

Example output for the test command above (tail /dev/zero) will look like this:

Feb 20 10:59:34 debian earlyoom[10231]: sending SIGTERM to process 7378 uid 1000 "tail": oom_score 156, VmRSS 4962 MiB

For older versions of earlyoom, use:

sudo journalctl -u earlyoom | grep -iE "(sending|killing)"
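For further processing, the "sending SIGTERM to process ..." line shown above can be split into fields with a regular expression. The following is a sketch based only on that single example line; the pattern and the names KILL_LINE and parse_kill_line are assumptions and may need adjusting for other earlyoom versions.

```python
import re

# Pattern derived from the single example log line in this README (an assumption).
KILL_LINE = re.compile(
    r'sending (?P<signal>SIG[A-Z]+) to process (?P<pid>\d+) '
    r'uid (?P<uid>\d+) "(?P<name>[^"]*)": '
    r'oom_score (?P<oom_score>\d+), VmRSS (?P<rss_mib>\d+) MiB'
)


def parse_kill_line(line):
    """Return the fields of a kill message as a dict, or None if the line is something else."""
    m = KILL_LINE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    for key in ("pid", "uid", "oom_score", "rss_mib"):
        fields[key] = int(fields[key])
    return fields


line = ('Feb 20 10:59:34 debian earlyoom[10231]: sending SIGTERM to process 7378 '
        'uid 1000 "tail": oom_score 156, VmRSS 4962 MiB')
print(parse_kill_line(line))
```

A plain grep for "sending" also matches the startup line "sending SIGTERM when mem avail <= ...". The pattern above requires "to process", so that line is skipped.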

Notifications

Since version 1.6, earlyoom can send notifications about killed processes via the system d-bus. Pass -n to enable them.

To actually see the notifications in your GUI session, you need to have systembus-notify running as your user.

Additionally, earlyoom can execute a script for each process killed, providing information about the process via the EARLYOOM_PID, EARLYOOM_UID and EARLYOOM_NAME environment variables. Pass -N /path/to/script to enable.

Warning: In dry-run mode (--dryrun), the script will be executed in rapid succession, so make sure you have some sort of rate limit implemented.
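A hook script for -N could look roughly like the sketch below. Only the three EARLYOOM_* variable names come from earlyoom; the stamp file path, the 60 second interval and the helper names are hypothetical choices made for this example, and the rate limit is a simple timestamp file.

```python
#!/usr/bin/env python3
import os
import sys
import time

STAMP_FILE = "/tmp/earlyoom-hook.stamp"  # hypothetical path; prefer a directory only the service can write
MIN_INTERVAL = 60.0  # seconds between reports; arbitrary choice


def build_message(env):
    """Format a one-line report from the variables earlyoom exports."""
    return "earlyoom killed %s (pid %s, uid %s)" % (
        env.get("EARLYOOM_NAME", "?"),
        env.get("EARLYOOM_PID", "?"),
        env.get("EARLYOOM_UID", "?"),
    )


def allowed(stamp_file, min_interval, now):
    """Return True and refresh the stamp if the last report is old enough."""
    try:
        last = os.path.getmtime(stamp_file)
    except OSError:
        last = None  # no stamp yet: first run
    if last is not None and now - last < min_interval:
        return False
    with open(stamp_file, "w"):
        pass  # create or truncate the stamp file
    os.utime(stamp_file, (now, now))
    return True


def main():
    if allowed(STAMP_FILE, MIN_INTERVAL, time.time()):
        # Replace this print with the real action, e.g. sending an email.
        print(build_message(os.environ), file=sys.stderr)


if __name__ == "__main__":
    main()
```

Save it somewhere, make it executable, and pass its path via -N /path/to/script. Calls that arrive within MIN_INTERVAL of the last report exit without doing anything.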

Preferred Processes

The command-line flag --prefer specifies processes to prefer killing; likewise, --avoid specifies processes to avoid killing. See https://github.com/rfjakob/earlyoom/blob/master/MANPAGE.md#--prefer-regex for details.

Configuration file

If you are running earlyoom as a system service (through systemd or init.d), you can adjust its configuration via the file provided in /etc/default/earlyoom. The file already contains some examples in the comments, which you can use to build your own configuration from the supported command line options, for example:

EARLYOOM_ARGS="-m 5 -r 60 --avoid '(^|/)(init|Xorg|ssh)$' --prefer '(^|/)(java|chromium)$'"

After adjusting the file, simply restart the service to apply the changes. For example, for systemd:

systemctl restart earlyoom

Please note that this configuration file has no effect on earlyoom instances outside of systemd/init.d.
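Before putting a regular expression into EARLYOOM_ARGS, it can help to check which process names it matches. The sketch below tries the two patterns from the example above with Python's re module. earlyoom itself does the matching in C, so treat Python's regex dialect as an approximation (for these simple patterns it should behave the same). The classify helper and its "avoid wins over prefer" order are simplifications invented for this example.

```python
import re

# The two patterns from the EARLYOOM_ARGS example.
AVOID = re.compile(r"(^|/)(init|Xorg|ssh)$")
PREFER = re.compile(r"(^|/)(java|chromium)$")


def classify(name):
    """Return 'avoid', 'prefer' or 'neutral' for a process name (avoid checked first)."""
    if AVOID.search(name):
        return "avoid"
    if PREFER.search(name):
        return "prefer"
    return "neutral"


for name in ("ssh", "sshd", "/usr/bin/Xorg", "java", "chromium-browser"):
    print("%-18s %s" % (name, classify(name)))
```

Because the patterns end in $, they match "ssh" but not "sshd", and "chromium" but not "chromium-browser". Drop the $ or extend the alternatives if you want to cover such variants.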

Command line options

earlyoom v1.8
Usage: ./earlyoom [OPTION]...

  -m PERCENT[,KILL_PERCENT] set available memory minimum to PERCENT of total
                            (default 10 %).
                            earlyoom sends SIGTERM once below PERCENT, then
                            SIGKILL once below KILL_PERCENT (default PERCENT/2).
  -s PERCENT[,KILL_PERCENT] set free swap minimum to PERCENT of total (default
                            10 %).
                            Note: both memory and swap must be below minimum for
                            earlyoom to act.
  -M SIZE[,KILL_SIZE]       set available memory minimum to SIZE KiB
  -S SIZE[,KILL_SIZE]       set free swap minimum to SIZE KiB
  -n                        enable d-bus notifications
  -N /PATH/TO/SCRIPT        call script after oom kill
  -g                        kill all processes within a process group
  -d, --debug               enable debugging messages
  -v                        print version information and exit
  -r INTERVAL               memory report interval in seconds (default 1), set
                            to 0 to disable completely
  -p                        set niceness of earlyoom to -20 and oom_score_adj to
                            -100
  --ignore-root-user        do not kill processes owned by root
  --sort-by-rss             find process with the largest rss (default oom_score)
  --prefer REGEX            prefer to kill processes matching REGEX
  --avoid REGEX             avoid killing processes matching REGEX
  --ignore REGEX            ignore processes matching REGEX
  --dryrun                  dry run (do not kill any processes)
  --syslog                  use syslog instead of std streams
  -h, --help                this help text

See the man page for details.
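To make the PERCENT[,KILL_PERCENT] syntax of -m and -s concrete, here is a small sketch of how such an argument can be interpreted, including the documented default of KILL_PERCENT = PERCENT/2. The function name and the range check are assumptions of this sketch, not earlyoom's actual parser.

```python
def parse_percent_arg(value):
    """Parse "PERCENT[,KILL_PERCENT]" into (term_percent, kill_percent)."""
    term_text, sep, kill_text = value.partition(",")
    term = float(term_text)
    # Documented default: KILL_PERCENT is half of PERCENT when not given.
    kill = float(kill_text) if sep else term / 2
    # Range check is an assumption of this sketch.
    if not (0 <= kill <= term <= 100):
        raise ValueError("expected 0 <= KILL_PERCENT <= PERCENT <= 100")
    return term, kill


for arg in ("10", "5", "10,3"):
    term, kill = parse_percent_arg(arg)
    print("-m %-5s SIGTERM at <= %.2f%%, SIGKILL at <= %.2f%%" % (arg, term, kill))
```

With no option given, the defaults of 10% and 5% correspond to the thresholds printed at startup in the example output under Use.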

Contribute

Bug reports and pull requests are welcome via GitHub. In particular, I am glad to accept

Implementation Notes

Changelog