Open dustymabe opened 9 months ago
We should probably add a QEMU test that sets e.g. -rtc=1970-01-01 to get coverage on these types of issues. (I don't think we need to try to get things to work perfectly in that mode, but it should at least not completely break boot.)
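A rough sketch of what such a test invocation might look like. QEMU's documented spelling for this option is -rtc base=DATE; the binary name, memory size, and image path below are placeholders, not anything taken from the CoreOS test harness. The helper only prints the command so it can be inspected before running it.

```shell
# Print (do not run) a QEMU command line whose guest RTC starts at the Unix epoch.
# Assumptions: qemu-system-x86_64, 2048 MiB of RAM, and the disk image path are
# illustrative placeholders; only "-rtc base=1970-01-01" is the point here.
qemu_epoch_cmd() {
  image="$1"
  printf '%s ' qemu-system-x86_64 -m 2048 -nographic \
    -rtc base=1970-01-01 \
    -drive "file=${image},if=virtio"
  printf '\n'
}
```

Calling qemu_epoch_cmd fcos.qcow2 prints the full command, which can then be pasted into a test script or run with eval.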
Are there any workarounds for this when you cannot ssh into the rpi4? Several rpi4s now report "system is booting up", so I cannot ssh into them.
Try to catch the kernel command line and add systemd.mask=coreos-ignition-write-issues.service to it. I think that should let you get the system up and then you can apply the workaround as suggested in the description here.
I've engaged @keszybz on this issue and he is looking into it so I'm hoping we'll have a fix soon.
Alternatively we should probably look at not running journalctl --list-boots on boot and using some other mechanism for getting this information.
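This is not the alternative mechanism itself, but short of replacing the call, the hang can at least be bounded. A minimal sketch, assuming GNU coreutils timeout is available; the helper name run_bounded and the 10 second limit in the usage comment are hypothetical:

```shell
# Run a command with an upper bound on its runtime.
# $1 = time limit in seconds, remaining arguments = the command to run.
# GNU timeout exits with status 124 when the limit is hit.
run_bounded() {
  limit="$1"
  shift
  timeout "$limit" "$@"
}

# Example (hypothetical 10 second limit):
#   run_bounded 10 journalctl --list-boots || echo "list-boots did not complete"
```

With this in place a stuck journalctl --list-boots would be killed after the limit instead of blocking boot indefinitely.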
Proposed fix to the infinite loop problem in https://github.com/systemd/systemd/pull/31975
Though, note: that PR doesn't address: "Alternatively we should probably look at not running journalctl --list-boots on boot and using some other mechanism for getting this information."
The proposed fix to the infinite loop problem (systemd/systemd#31975) was merged in https://github.com/systemd/systemd/commit/1e8c0c671e3076db811804343b3b8d744bcf27ac
Fixed in systemd v256 and newer which is in F41 now.
The fix for this went into next stream release 41.20240916.1.0. Please try out the new release and report issues.
This was originally reported in https://discussion.fedoraproject.org/t/fcos-39-fails-booting-due-journalctl-list-boots-never-return/97006/6
The coreos-ignition-write-issues.service service hangs on boot and prevents the boot from progressing further. The call that is getting stuck is journalctl --list-boots.
For some reason after the transition to F39 there is some corner case where journalctl --list-boots can just run indefinitely.
I had a Rpi4 that experienced this on the 38.20231027.3.2 -> 39.20231101.3.0 transition.
Booting the old 38.20231027.3.2 entry I was able to get the system up and running again. Inspecting the system showed:
I added a TimeoutStartSec override as suggested in the discussion forum post and was able to then upgrade to F39.
Once the system is up I then see:
I had to CTRL-C out of the process after almost 2 hours. Before the upgrade the process finished in less than a second.
As suggested in the discussion forum we should probably find a better way to achieve our goal rather than running journalctl --list-boots on startup anyway, but there is clearly IMO a bug in systemd here that was introduced with systemd-253.12-1.fc38 ⟶ 254.5-2.fc39.
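For anyone applying the TimeoutStartSec override mentioned above, a minimal sketch of writing such a drop-in follows. The helper name, the drop-in file name 10-timeout.conf, and the 30 second value in the usage comment are assumptions; the exact override from the forum post is not reproduced here.

```shell
# Write a systemd drop-in that caps the start time of
# coreos-ignition-write-issues.service.
# $1 = systemd unit directory (on a real system: /etc/systemd/system)
# $2 = timeout in seconds (hypothetical value, pick what suits the system)
write_timeout_dropin() {
  dropin_dir="$1/coreos-ignition-write-issues.service.d"
  mkdir -p "$dropin_dir"
  printf '[Service]\nTimeoutStartSec=%s\n' "$2" > "$dropin_dir/10-timeout.conf"
}

# Example, as root on the affected machine:
#   write_timeout_dropin /etc/systemd/system 30 && systemctl daemon-reload
```

Taking the directory as a parameter lets the helper be tried against a scratch directory before touching /etc.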