add unit to drop to emergency shell after a timeout if still in initramfs

dustymabe commented 3 years ago

A few things will wait forever for a remote resource to become available before boot continues. One is the PXE rootfs download (https://github.com/coreos/fedora-coreos-config/pull/964); another is Ignition trying to fetch a remote Ignition config.

Sometimes it's useful to break out of the wait and debug the environment. Maybe we could add a systemd unit that takes a timeout as an argument and, if that timeout is reached, breaks into the emergency shell.

Something like rd.coreos.emergency.timeout=600. If that option isn't present, the unit doesn't run. If it is, the unit runs sleep 600 and then drops to the emergency shell.
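To make the proposed behaviour concrete, here is a minimal sketch of the logic in Python. It is only an illustration: a real implementation in the initramfs would more likely be a shell script or a systemd generator. The helper names (`parse_emergency_timeout`, `wait_then_break`) and the `enter_emergency` callback are hypothetical, and how the emergency shell is actually entered is deliberately left to the caller.

```python
import time

# Kernel argument name as proposed in this issue.
KARG = "rd.coreos.emergency.timeout"


def parse_emergency_timeout(cmdline):
    """Return the timeout in seconds, or None if the karg is absent or invalid."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == KARG and sep:
            value = val  # if the karg is repeated, the last occurrence wins
    if value is None:
        return None
    try:
        seconds = int(value)
    except ValueError:
        return None
    return seconds if seconds > 0 else None


def wait_then_break(cmdline, sleep=time.sleep, enter_emergency=lambda: None):
    """Sleep for the configured timeout, then call enter_emergency().

    Returns False without doing anything when the karg is not set,
    mirroring "if that option isn't present, the unit doesn't run".
    """
    timeout = parse_emergency_timeout(cmdline)
    if timeout is None:
        return False
    sleep(timeout)
    enter_emergency()  # hypothetical hook, e.g. something that starts the emergency shell
    return True
```

On a real system the command line would be read from /proc/cmdline and passed in as `cmdline`. In practice the unit would also need to be stopped once the system leaves the initramfs, so that the timeout only fires while boot is still stuck there.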

A counterargument to a feature like this is that someone can just add an rd.break argument and debug that way. Though I could see a user choosing to always put rd.coreos.emergency.timeout=600 on the kernel command line so that they get an emergency shell automatically any time things take longer than a set amount of time.

travier commented 3 years ago

We discussed that in the community meeting today.

We initially agreed to make it first boot only, but might reconsider and enable it for all boots, given that we don't really have a good threat model where this would be worse than the current status:

* AGREED: we think having a generic configurable timeout via a kernel
argument to break if things wait infinitely would be useful. We just
want to make sure that it's not on by default and that it only
applies to the first boot of a machine. (dustymabe, 17:21:16)
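As a rough sketch of the "opt-in and first boot only" condition agreed above, the check could look like the following. It assumes that first boot is detected through the `ignition.firstboot` kernel argument; the thread does not say how the gating would actually be done, and `timeout_applies` is a hypothetical name.

```python
def timeout_applies(cmdline):
    """True only if the timeout karg is set AND this looks like a first boot."""
    args = cmdline.split()
    # Opt-in: the feature is off unless the user passes the karg explicitly.
    has_timeout = any(a.startswith("rd.coreos.emergency.timeout=") for a in args)
    # Assumption: first boot is signalled by ignition.firstboot on the command line.
    first_boot = any(
        a == "ignition.firstboot" or a.startswith("ignition.firstboot=") for a in args
    )
    return has_timeout and first_boot
```

If the restriction to first boot is later dropped, as suggested above, the check reduces to `has_timeout` alone.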

Related console access issue: https://github.com/coreos/fedora-coreos-tracker/issues/805

jlebon commented 3 years ago

add unit to drop to emergency shell after a timeout if still in initramfs #796