systemd / systemd

The systemd System and Service Manager
https://systemd.io
GNU General Public License v2.0

Coredump of `plymouthd` is not saved if it is started on a non-systemd initrd, and left running over the initrd→host transition #34334

Closed filip-hejsek closed 1 month ago

filip-hejsek commented 2 months ago

systemd version the issue has been seen with

256.5-1-arch

Used distribution

Arch Linux

Linux kernel version used

6.10.8-arch1-1

CPU architectures issue was seen on

x86_64

Component

systemd-coredump

Expected behaviour you didn't see

If plymouthd crashes during boot, the coredump should be saved. It should be possible to debug the coredump using coredumpctl debug.

Unexpected behaviour you saw

plymouthd crashes during boot. systemd-coredump logs the crash in the journal but doesn't save the coredump.

This is caused by the following sequence of events:

  1. plymouthd is started from initrd
  2. plymouthd sets its argv[0][0] to @ to avoid being killed by systemd
  3. initrd switches to real rootfs and starts systemd
  4. systemd (or something else?) moves plymouthd to init.scope
  5. plymouthd crashes
  6. kernel runs systemd-coredump
  7. systemd-coredump (wrongly) determines that this is a pid1 crash because plymouthd is in init.scope: https://github.com/systemd/systemd/blob/8b29949a4142318bacf3d30751aa37b8f29b5c1e/src/coredump/coredump.c#L1039
  8. systemd-coredump decides to disable coredump collection https://github.com/systemd/systemd/blob/8b29949a4142318bacf3d30751aa37b8f29b5c1e/src/coredump/coredump.c#L1752-L1755
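
Steps 7 and 8 can be modelled in a few lines. This is a simplified sketch of the decision as this report describes it, not systemd's actual code: the function names are hypothetical, and the stricter PID-based check is included only for comparison.

```python
# Simplified model of the classification described in steps 7 and 8.
# All names are hypothetical; this is not systemd's implementation.

INIT_SCOPE = "init.scope"


def looks_like_pid1_crash(unit: str) -> bool:
    """Treat any crash reported from init.scope as a PID 1 crash."""
    return unit == INIT_SCOPE


def is_really_pid1(pid: int) -> bool:
    """Stricter check, for comparison: only PID 1 itself qualifies."""
    return pid == 1


# Values from this report: plymouthd ran as PID 230 inside init.scope.
crashed_pid, crashed_unit = 230, "init.scope"
print(looks_like_pid1_crash(crashed_unit))  # True
print(is_really_pid1(crashed_pid))          # False
```

The two checks disagree for plymouthd, which is the mismatch this report is about.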

Steps to reproduce the problem

No response

Additional program output to the terminal or log subsystem illustrating the issue

systemd-coredump[953]: Process 230 (plymouthd) of user 0 terminated abnormally with signal 11/SEGV, processing...
systemd-coredump[953]: Due to PID 1 having crashed coredump collection will now be turned off.
systemd-coredump[953]: Resource limits disable core dumping for process 230 (plymouthd).
systemd-coredump[953]: Process 230 (plymouthd) of user 0 terminated abnormally without generating a coredump.

poettering commented 2 months ago

systemd (or something else?) moves plymouthd to init.scope

how did that happen?

is the initrd using systemd?

filip-hejsek commented 2 months ago

systemd (or something else?) moves plymouthd to init.scope

how did that happen?

I have no idea. I will try to investigate later today.

This is from the coredump log entry:

    COREDUMP_PID=230
    COREDUMP_UID=0
    COREDUMP_GID=0
    COREDUMP_SIGNAL_NAME=SIGSEGV
    COREDUMP_SIGNAL=11
    COREDUMP_RLIMIT=0
    COREDUMP_HOSTNAME=filip-e15
    COREDUMP_COMM=plymouthd
    COREDUMP_EXE=/usr/bin/plymouthd
    COREDUMP_UNIT=init.scope
    COREDUMP_SLICE=-.slice
    COREDUMP_CMDLINE=@lymouthd --mode=boot --pid-file=/run/plymouth/pid --attach-to-session
    COREDUMP_CGROUP=/init.scope
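
The leading `@` in `COREDUMP_CMDLINE` is the marker from step 2 of the original report. A minimal sketch of how such a marker can be checked, assuming the NUL-separated format of `/proc/<pid>/cmdline`; the helper names are hypothetical and this is not systemd's code:

```python
def has_at_marker(raw_cmdline: bytes) -> bool:
    """Return True if the first byte of argv[0] is '@'."""
    argv0 = raw_cmdline.split(b"\0", 1)[0]
    return argv0.startswith(b"@")


def read_cmdline(pid: int) -> bytes:
    """Read the raw, NUL-separated command line of a process (Linux only)."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        return f.read()


# The command line from the log entry above, in /proc format.
sample = (
    b"@lymouthd\0--mode=boot\0--pid-file=/run/plymouth/pid\0"
    b"--attach-to-session\0"
)
print(has_at_marker(sample))  # True
```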

is the initrd using systemd?

No, it is standard Arch initrd generated by mkinitcpio with the following config:

HOOKS=(base udev autodetect modconf keyboard keymap block plymouth resume filesystems)

Here are links to the most important pieces if you want to see them: init script, functions used by init script, hook which starts plymouthd.

filip-hejsek commented 1 month ago

I will try to investigate later today.

I have done that investigation and it is definitely systemd which is moving plymouthd into init.scope.

Here is a (lightly edited) transcript of a demonstration of this behavior:

[host]# systemd-nspawn -U -D / -x
# sleep infinity &
[1] 10
# systemd-cgls
CGroup /:
-.slice
├─ 1 -bash
├─10 sleep infinity
├─11 systemd-cgls
└─12 less
# exec init systemd.unit=emergency.target
[boot messages and root login...]
# systemd-cgls
CGroup /:
-.slice
├─init.scope
│ ├─ 1 init systemd.unit=emergency.target
│ └─10 sleep infinity
└─system.slice
  └─emergency.service
    ├─35 /usr/lib/systemd/systemd-sulogin-shell emergency
    ├─36 bash
    ├─38 systemd-cgls
    └─39 less

Notice how the sleep process (PID 10) is moved from -.slice to init.scope after systemd has started.
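
The same move can be observed for a single process without `systemd-cgls` by reading `/proc/<pid>/cgroup`. A minimal sketch, assuming the unified (cgroup v2) hierarchy, where the relevant line has the form `0::<path>`; the helper names are hypothetical:

```python
from typing import Optional


def cgroup_v2_path(proc_cgroup_text: str) -> Optional[str]:
    """Extract the cgroup v2 path from the contents of /proc/<pid>/cgroup."""
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue  # skip anything that is not "id:controllers:path"
        hierarchy_id, _controllers, path = parts
        if hierarchy_id == "0":  # hierarchy 0 is the unified hierarchy
            return path
    return None


def cgroup_of(pid: int) -> Optional[str]:
    """Look up the cgroup v2 path of a running process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as f:
        return cgroup_v2_path(f.read())


# Shape of the data one would expect before and after systemd starts,
# going by the transcript above.
print(cgroup_v2_path("0::/\n"))            # /
print(cgroup_v2_path("0::/init.scope\n"))  # /init.scope
```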

poettering commented 1 month ago

so i guess this is not surprising, we cannot really leave any processes in the top-level cgroup (because of the no-processes-in-inner-cgroup rule of cgroup). Hence we move them to init.scope.

And we have no fricking clue what kind of process this is, hence we move things over as a blanket measure.

Dunno. I don't think this is something we should try to fix. If you run your initrd without systemd and leave stuff running from it it's kinda a necessary effect. Maybe talk to your initrd maintainers to just use systemd, or live with the fact that this is the way it is? Or drop plymouth? Or stop leaving initrd processes running.

Anyway, I see nothing actionable for us.

YHNdnzj commented 1 month ago

Yeah, please raise this on the mkinitcpio issue tracker instead. Also note that mkinitcpio supports a systemd-based initramfs, which will be made the default in the foreseeable future.

Closing this one here.

filip-hejsek commented 1 month ago

If you run your initrd without systemd and leave stuff running from it it's kinda a necessary effect.

The problem is that there is no good alternative. plymouthd was designed to start from the initrd and keep running after switching to the real rootfs. The goal is to show the boot splash as early in the boot process as possible and keep it running until GDM is ready to take over the display.

Maybe talk to your initrd maintainers

Yeah, please raise this on the mkinitcpio issue tracker instead.

This is not just a mkinitcpio issue. Other initrd generators which don't use systemd (e.g. dracut, initramfs-tools) will have the same problem, because there is no way to avoid it.

You can't say this is their bug if you don't provide them with any means to avoid the problem.

Is using a non-systemd initrd unsupported now?

Or let me ask a different question: What should mkinitcpio do differently to fix this bug? What do you expect them to do if I report this bug to them?

YHNdnzj commented 1 month ago

Is using a non-systemd initrd unsupported now?

It is still supported, but there will inevitably be pitfalls like this due to the inconsistent state. As explained, there's nothing for systemd to fix here. We have no way of determining where something that was started before us belongs.

And do note that I'm also a mkinitcpio collaborator. I'd probably suggest placing all processes in initrd.scope or such. But this would be possible only if it's brought up at the proper place...

YHNdnzj commented 1 month ago

(I don't really use plymouth, otherwise I'd submit an issue myself)

filip-hejsek commented 1 month ago

I'd probably suggest placing all processes in initrd.scope or such.

This is an actually helpful suggestion, thanks.

So non-systemd initrds, if they want to retain some process after starting systemd, should just mount the cgroup filesystem, create a cgroup, and put the process there? And systemd will be fine with this?
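
The steps in that question can be sketched as follows. This is an illustration only. It assumes a cgroup v2 filesystem is already mounted at the directory passed as `cgroup_root` (commonly `/sys/fs/cgroup`), the name `initrd.scope` is just the one suggested above, and the function name is hypothetical. Whether systemd then leaves the process where it is remains the open question here; the sketch does not establish that. A real initrd would more likely do the same from its shell init script.

```python
import os


def move_pid_to_cgroup(cgroup_root: str, name: str, pid: int) -> str:
    """Create the cgroup <cgroup_root>/<name> and move <pid> into it."""
    path = os.path.join(cgroup_root, name)
    # On a cgroup filesystem, creating a directory creates the cgroup.
    os.makedirs(path, exist_ok=True)
    # Writing a PID to cgroup.procs migrates that process into the cgroup.
    with open(os.path.join(path, "cgroup.procs"), "w") as f:
        f.write(f"{pid}\n")
    return path


# Hypothetical usage; needs root and a mounted cgroup2 filesystem:
# move_pid_to_cgroup("/sys/fs/cgroup", "initrd.scope", 230)
```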

But this would be possible only if it's brought up at the proper place...

I will report this to the mkinitcpio issue tracker, probably sometime tomorrow.

filip-hejsek commented 1 month ago

https://gitlab.archlinux.org/archlinux/mkinitcpio/mkinitcpio/-/issues/276