rfjakob / earlyoom

earlyoom - Early OOM Daemon for Linux
MIT License

systemd killing earlyoom 1.8.2 on rocky 9 #328

Closed sircharlesxx closed 2 weeks ago

sircharlesxx commented 2 weeks ago

I've compiled earlyoom 1.8.2 on my rocky 9 system. When earlyoom attempts to kill a service, I see the following journalctl output:

earlyoom[1788]: low memory! at or below SIGTERM limits: mem 10.00%, swap 20.00%
earlyoom[1788]: sending SIGTERM to process 1792 uid 0 "tail": oom_score 1259, VmRSS 7112 MiB, cmdline "tail -f /dev/zero"
systemd[1]: earlyoom.service: Main process exited, code=killed, status=31/SYS
systemd[1]: earlyoom.service: Failed with result 'signal'.
systemd[1]: earlyoom.service: Scheduled restart job, restart counter is at 1.
systemd[1]: Stopped Early OOM Daemon.
systemd[1]: Started Early OOM Daemon.

If I remove the following line from /usr/lib/systemd/system/earlyoom.service, this issue doesn't occur:

SystemCallFilter=@system-service process_mrelease

Any advice on how to proceed?

rfjakob commented 2 weeks ago

Hi, can you run "strace -p" against earlyoom when you trigger this? I want to see what syscall is causing this.

sircharlesxx commented 2 weeks ago

Sure! I've attached the entire output: earlyoom-strace-10-29-24.txt (https://github.com/user-attachments/files/17561673/earlyoom-strace-10-29-24.txt)

rfjakob commented 2 weeks ago

Can you also check the audit log (see below)?

https://www.reddit.com/r/systemd/s/EK45M4SooU

You should have a SECCOMP audit message generated when that happens. Check:

journalctl _AUDIT_TYPE_NAME=SECCOMP

The syscall field will contain the syscall number.

sircharlesxx commented 2 weeks ago

That exact command wasn't working for me for whatever reason, but I believe the audit.log should have what you are looking for. Here is the output of that.

type=SECCOMP msg=audit(1730230250.468:218): auid=4294967295 uid=61876 gid=61876 ses=4294967295 pid=1756 comm="earlyoom" exe="/usr/bin/earlyoom" sig=31 arch=c000003e syscall=448 compat=0 ip=0x7f804bf0713d code=0x80000000AUID="unset" UID="unknown(61876)" GID="earlyoom" ARCH=x86_64 SYSCALL=unknown-syscall(-1)
type=ANOM_ABEND msg=audit(1730230250.468:219): auid=4294967295 uid=61876 gid=61876 ses=4294967295 pid=1756 comm="earlyoom" exe="/usr/bin/earlyoom" sig=31 res=1AUID="unset" UID="unknown(61876)" GID="earlyoom"
type=SERVICE_STOP msg=audit(1730230250.470:220): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=earlyoom comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'UID="root" AUID="unset"
type=BPF msg=audit(1730230250.581:221): prog-id=59 op=UNLOAD
type=SERVICE_START msg=audit(1730230250.738:222): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=earlyoom comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'UID="root" AUID="unset"
type=SERVICE_STOP msg=audit(1730230250.738:223): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=earlyoom comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'UID="root" AUID="unset"
type=BPF msg=audit(1730230250.740:224): prog-id=62 op=LOAD
type=BPF msg=audit(1730230250.740:225): prog-id=63 op=LOAD
type=BPF msg=audit(1730230250.740:226): prog-id=64 op=LOAD
type=BPF msg=audit(1730230250.740:227): prog-id=60 op=UNLOAD
type=BPF msg=audit(1730230250.740:228): prog-id=61 op=UNLOAD
type=SERVICE_START msg=audit(1730230250.751:229): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=earlyoom comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'UID="root" AUID="unset"

If that isn't helpful please let me know, and I can try and get your journalctl command to work. For some reason I'm getting the following from journalctl _AUDIT_TYPE_NAME=SECCOMP:

-- No entries --

rfjakob commented 2 weeks ago

Smoking gun from strace:

process_mrelease(4, 0) = 448
+++ killed by SIGSYS +++

Smoking gun from seccomp:

type=SECCOMP msg=audit(1730230250.468:218): auid=4294967295 uid=61876 gid=61876 ses=4294967295 pid=1756 comm="earlyoom" exe="/usr/bin/earlyoom" sig=31 arch=c000003e syscall=448 compat=0 ip=0x7f804bf0713d code=0x80000000AUID="unset" UID="unknown(61876)" GID="earlyoom" ARCH=x86_64 SYSCALL=unknown-syscall(-1)

Note: syscall 448 = process_mrelease according to https://filippo.io/linux-syscall-table/

Looks like the line

SystemCallFilter=@system-service process_mrelease

does not work as it should. Maybe your systemd is too old to recognize process_mrelease, which is somewhat new?
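
The SECCOMP record quoted above can be decoded mechanically. The following is a minimal sketch, not part of earlyoom or auditd: the function name `decode_seccomp` and the three lookup tables are assumptions made for illustration, and the tables cover only the values that appear in this thread (x86_64 Linux). The sample record is an abbreviated copy of the one posted above.

```python
import re

# Assumption: hand-written lookup tables that cover only the values seen in
# this thread (x86_64 Linux). A real tool would use a full syscall table.
ARCHES = {"c000003e": "x86_64"}
SIGNALS = {31: "SIGSYS"}
SYSCALLS_X86_64 = {448: "process_mrelease"}


def decode_seccomp(record):
    """Extract and name the sig, arch and syscall fields of a SECCOMP record."""
    # The pattern is case-sensitive, so the enriched upper-case ARCH= and
    # SYSCALL= fields at the end of the record are skipped.
    fields = dict(re.findall(r"\b(sig|arch|syscall)=([0-9a-f]+)", record))
    sig = int(fields["sig"])
    nr = int(fields["syscall"])
    return {
        "signal": SIGNALS.get(sig, str(sig)),
        "arch": ARCHES.get(fields["arch"], fields["arch"]),
        "syscall": SYSCALLS_X86_64.get(nr, f"unknown({nr})"),
    }


# Abbreviated copy of the record from the audit log in this thread.
record = (
    'type=SECCOMP msg=audit(1730230250.468:218): pid=1756 comm="earlyoom" '
    "sig=31 arch=c000003e syscall=448 compat=0 "
    "ARCH=x86_64 SYSCALL=unknown-syscall(-1)"
)
print(decode_seccomp(record))
```

The enriched `SYSCALL=unknown-syscall(-1)` field is not used here. The numeric `syscall=448` field is what identifies the call.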

rfjakob commented 2 weeks ago

Can you check where process_mrelease appears in

sudo systemd-analyze syscall-filter

?

Here it appears in the section "Ungrouped System Calls".

rfjakob commented 2 weeks ago

Also, please check if you got a warning like this on boot (or you can trigger it again via systemctl daemon-reload):

System call process_mrelease is not known, ignoring.

sircharlesxx commented 2 weeks ago

I was wondering that as well. Here is the output of systemd-analyze syscall-filter:

@file-system
    # File system operations
    access
    chdir
    chmod
    close
    creat
    faccessat
    faccessat2
    fallocate
    fchdir
    fchmod
    fchmodat
    fcntl
    fcntl64
    fgetxattr
    flistxattr
...skipping...
    process_mrelease
    process_vm_readv
    process_vm_writev
    pselect6
    pselect6_time64

I see part of the output shows as skipping. I am running systemd-252-32 on kernel 5.14.0-427.40.1.el9_4.x86_64.

I do see this output when running systemctl daemon-reload:

Oct 29 15:01:27 systemd[1]: /usr/lib/systemd/system/earlyoom.service:47: Failed to parse system call, ignoring: process_mrelease
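
A warning in this format can be picked out of saved journal text with a short script. This is a minimal sketch under stated assumptions: the helper name `ignored_syscalls` is hypothetical, and the pattern is written only against the single warning line shown above, so other systemd versions may word the message differently (an earlier comment in this thread quotes a different wording).

```python
import re

# Assumption: matches only the warning wording seen in this thread.
WARNING = re.compile(
    r"(?P<unit>/\S+\.service):(?P<line>\d+): "
    r"Failed to parse system call, ignoring: (?P<syscall>\S+)"
)


def ignored_syscalls(journal_lines):
    """Return (unit_file, line_number, syscall) for each matching warning."""
    hits = []
    for text in journal_lines:
        match = WARNING.search(text)
        if match:
            hits.append((match["unit"], int(match["line"]), match["syscall"]))
    return hits


sample = [
    "Oct 29 15:01:27 systemd[1]: /usr/lib/systemd/system/earlyoom.service:47: "
    "Failed to parse system call, ignoring: process_mrelease",
    "systemd[1]: Started Early OOM Daemon.",
]
print(ignored_syscalls(sample))
```

Any hit means systemd dropped that name from the filter, so the effective filter is narrower than the unit file suggests.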

If I make this edit in /usr/lib/systemd/system/earlyoom.service to remove just the process_mrelease part, it seems I'm still getting the same result when earlyoom kills a pid:

SystemCallArchitectures=native
#SystemCallFilter=@system-service process_mrelease
SystemCallFilter=@system-service
SystemCallFilter=~@privileged

type=SECCOMP msg=audit(1730231818.492:264): auid=4294967295 uid=61876 gid=61876 ses=4294967295 pid=1814 comm="earlyoom" exe="/usr/bin/earlyoom" sig=31 arch=c000003e syscall=448 compat=0 ip=0x7f4b8f30713d code=0x80000000AUID="unset" UID="unknown(61876)" GID="earlyoom" ARCH=x86_64 SYSCALL=unknown-syscall(-1)
type=ANOM_ABEND msg=audit(1730231818.492:265): auid=4294967295 uid=61876 gid=61876 ses=4294967295 pid=1814 comm="earlyoom" exe="/usr/bin/earlyoom" sig=31 res=1AUID="unset" UID="unknown(61876)" GID="earlyoom"
type=SERVICE_STOP msg=audit(1730231818.494:266): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=earlyoom comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'UID="root" AUID="unset"
type=BPF msg=audit(1730231818.644:267): prog-id=78 op=UNLOAD
type=SERVICE_START msg=audit(1730231818.737:268): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=earlyoom comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'UID="root" AUID="unset"
type=SERVICE_STOP msg=audit(1730231818.737:269): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=earlyoom comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'UID="root" AUID="unset"
type=BPF msg=audit(1730231818.738:270): prog-id=81 op=LOAD
type=BPF msg=audit(1730231818.738:271): prog-id=82 op=LOAD
type=BPF msg=audit(1730231818.738:272): prog-id=83 op=LOAD
type=BPF msg=audit(1730231818.738:273): prog-id=79 op=UNLOAD
type=BPF msg=audit(1730231818.738:274): prog-id=80 op=UNLOAD
type=SERVICE_START msg=audit(1730231818.750:275): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=earlyoom comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'UID="root" AUID="unset"

rfjakob commented 2 weeks ago

You'll have to remove the line

SystemCallFilter=@system-service process_mrelease

entirely.
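
Why trimming the line to just @system-service gave the same crash can be shown with a toy model of an allow-list filter. This is a sketch under stated assumptions, not systemd's implementation: the contents of `SYSTEM_SERVICE` are an illustrative subset rather than the real group, the function names are hypothetical, and the separate `SystemCallFilter=~@privileged` deny line is left out. It assumes, as the audit log in this thread suggests, that process_mrelease is not part of the @system-service group on the affected system.

```python
# Assumption: illustrative subset only, not the real contents of the group.
SYSTEM_SERVICE = {"read", "write", "openat", "kill"}
GROUPS = {"@system-service": SYSTEM_SERVICE}


def build_allow_list(directive, known_syscalls):
    """Expand an allow-list directive, skipping names the parser does not know."""
    allowed = set()
    for token in directive.split():
        if token in GROUPS:
            allowed |= GROUPS[token]
        elif token in known_syscalls:
            allowed.add(token)
        # Unknown names are dropped, like the "ignoring" warning in the journal.
    return allowed


def verdict(syscall, allow_list):
    """None models a unit without the allow-list line: nothing is filtered."""
    if allow_list is None or syscall in allow_list:
        return "allowed"
    return "killed (SIGSYS)"


old_parser = set()                  # does not recognize process_mrelease
new_parser = {"process_mrelease"}   # recognizes it

original = build_allow_list("@system-service process_mrelease", old_parser)
trimmed = build_allow_list("@system-service", old_parser)
upgraded = build_allow_list("@system-service process_mrelease", new_parser)

for label, allow in [("original line", original), ("trimmed line", trimmed),
                     ("line removed", None), ("newer parser", upgraded)]:
    print(label, "->", verdict("process_mrelease", allow))
```

In this model the original and trimmed lines produce the same allow-list on the older parser, which is why only removing the line (or using a parser that knows the name) lets the call through.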

sircharlesxx commented 2 weeks ago

Okay thank you. I'll go ahead and mark this as closed, since it's an issue on my end with not having support for process_mrelease