RPi-Distro / pi-gen

Tool used to create the official Raspberry Pi OS images
BSD 3-Clause "New" or "Revised" License

Problems with `initramfs` not being generated #754

Closed jketterl closed 7 months ago

jketterl commented 7 months ago

Preface: I've been hunting this issue down all day and I still don't really know why it's happening. I'm mostly opening this issue to document my findings, hoping they are useful in some regard. I am undecided on whether this should be fixed here. It has been an interesting journey, that's for sure.

We are building Raspberry Pi images with pre-installed software for the OpenWebRX project, and we based the whole system on pi-gen.

The initial issue reported to me was that the most recent images could not be booted, and the error shown was "Failed to start kernel-command-line.service - Command from Kernel Command Line". Initial research led to this issue, which provided some insight. Analysis of the logs showed that the error was related to customizations done with the Raspberry Pi Imager; in particular, the path for the firstrun.sh script was incorrect (/boot/firstrun.sh was passed, while the script resides in /boot/firmware/firstrun.sh).

Further insight was found here: basically, there should be something that fixes that path, but for some reason it didn't.

More clues were extracted from the actual initramfs on one of the latest (2023-12-11) images of Raspberry Pi OS. In there, I found scripts/local-bottom/imager_fixup, which is presumably the implementation of the workaround mentioned in the issue above.
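For illustration only, here is a minimal sketch of the kind of rewrite such a workaround would have to perform on the kernel command line. It is not the content of imager_fixup, which is not reproduced in this thread. The `systemd.run=` parameter name and the sample command line are assumptions; only the two paths come from the description above.

```python
# Paths taken from the description above.
OLD_PATH = "/boot/firstrun.sh"
NEW_PATH = "/boot/firmware/firstrun.sh"


def fix_firstrun_path(cmdline: str) -> str:
    """Point a systemd.run= entry at /boot/firmware instead of /boot.

    Assumption: the script is passed as a `systemd.run=` parameter.
    Tokens that do not match exactly are left untouched.
    """
    fixed = []
    for token in cmdline.split():
        if token == f"systemd.run={OLD_PATH}":
            token = f"systemd.run={NEW_PATH}"
        fixed.append(token)
    return " ".join(fixed)


# Hypothetical command line, shortened for readability.
example = "console=tty1 rootwait systemd.run=/boot/firstrun.sh"
print(fix_firstrun_path(example))
```

The function is idempotent: a command line that already uses the /boot/firmware path is returned unchanged.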

Further investigation revealed that the initramfs* files were straight up missing from the images we were building. I tried building a "vanilla" image from the 2023-12-05-raspios-bookworm tag of the original pi-gen, and lo and behold, same picture.

I had to go down the rabbit hole of understanding how these initramfs* files are generated in the first place (the kernel package postinst script calls /etc/kernel/postinst.d/initramfs-tools, which in turn calls update-initramfs). Having a log of the last workable image also helped a lot in the process.
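A quick way to tell whether that chain actually ran is to look for its output. The sketch below lists files matching `initramfs*` in a boot directory of a mounted image; the function name and the idea of pointing it at a mounted boot partition are assumptions made for illustration, not part of pi-gen.

```python
from pathlib import Path


def find_initramfs_files(boot_dir: str, pattern: str = "initramfs*") -> list[str]:
    """Return the sorted names of regular files in boot_dir matching pattern.

    An empty list means no initramfs was generated (or the directory
    does not exist), which is the symptom described above.
    """
    root = Path(boot_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.glob(pattern) if p.is_file())
```

For example, `find_initramfs_files("/mnt/image/boot/firmware")` could be called against a loop-mounted image, where the mount point is whatever you chose when mounting.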

What I found is that at some point over the last few weeks, the sequence of package installation changed. I don't know exactly what the real cause is; I suspect a shift in dependencies somewhere. The result is that, as of now, the kernel linux-image-* packages are installed before initramfs-tools, which basically means that the kernel postinst script is unable to trigger the chain of events described above. It seems that there's also nothing in initramfs-tools to "catch up" in such a scenario.
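The ordering problem can be checked mechanically from a build log. The following is a sketch that assumes a dpkg.log-style log with lines of the form `DATE TIME configure PACKAGE:ARCH VERSION ...`; a pi-gen build log may look different, and the function names are made up for this example.

```python
import fnmatch


def configured_packages(log_lines):
    """Extract package names from 'configure' lines, in log order.

    Assumed line shape: DATE TIME configure PACKAGE:ARCH VERSION ...
    """
    names = []
    for line in log_lines:
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "configure":
            names.append(fields[3].split(":")[0])
    return names


def kernels_configured_too_early(order, kernel_glob="linux-image-*",
                                 helper="initramfs-tools"):
    """Return kernel packages that appear before `helper` in `order`.

    If `helper` never appears, every matching kernel package is returned.
    """
    early = []
    for pkg in order:
        if pkg == helper:
            break
        if fnmatch.fnmatch(pkg, kernel_glob):
            early.append(pkg)
    return early
```

A non-empty result from `kernels_configured_too_early(configured_packages(lines))` would point at the situation described above, where the kernel is set up before initramfs-tools.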

As an immediate measure, I have now introduced a very simple change in our fork of pi-gen: I have moved initramfs-tools from stage0/02-firmware/01-packages into its own *-packages file, and changed the numbering in a way that makes sure that this new file is processed first. This measure has successfully restored the initramfs files and allows our images to boot again.
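The change amounts to splitting one package list into two. As a sketch, the helper below removes a package from one list file and writes it into a separate file. The file names in the usage note are hypothetical (the exact numbering used in the fork is not stated here), and the code assumes the list is plain whitespace-separated package names without comments.

```python
from pathlib import Path


def move_package(src: Path, dst: Path, package: str) -> None:
    """Remove `package` from the list in `src` and write it alone to `dst`.

    Raises ValueError if `package` is not listed in `src`.
    """
    kept_lines = []
    found = False
    for line in src.read_text().splitlines():
        tokens = line.split()
        if package in tokens:
            found = True
            tokens = [t for t in tokens if t != package]
        if tokens:
            kept_lines.append(" ".join(tokens))
    if not found:
        raise ValueError(f"{package} is not listed in {src}")
    src.write_text("\n".join(kept_lines) + "\n")
    dst.write_text(package + "\n")
```

Called as `move_package(Path("01-packages"), Path("00-packages"), "initramfs-tools")` inside the stage directory, this would leave initramfs-tools in a lower-numbered file, which is the ordering the workaround relies on.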

I am unsure whether this fix is suitable as a more permanent solution, but I can provide a pull request if necessary. If not, I can only hope that somebody else can complete this puzzle, or at least provide additional pointers...

XECDesign commented 7 months ago

Is this not the same issue as this? In which case it should've been fixed by this commit.

jketterl commented 7 months ago

Yeah, looks like it. I never came across that issue, probably because the realization that this is related to initramfs came rather late. I did review the commits on the master branch to see if anything looked potentially related, but didn't spot the commit for the same reason... I probably should have tried building from master, but didn't, mostly because I'm trying to stick to the equivalent of the latest Pi OS release, or at least as close to it as possible.

I even attempted fixes similar to the one in the commit, but couldn't get them to work. I don't normally deal with this stuff...

XECDesign commented 7 months ago

Yeah, it's all a bit new for me as well and it can be quite a rabbit trail to find out what happens where exactly. Great work figuring it out.

Closing, since I am 90% certain that that is the fix.

jketterl commented 7 months ago

Yeah, can confirm: cherry-picking that commit also solves the problem. I'll use that for our builds, mostly for the sake of not diverging. Thank you for the fix, and also for filling in the blank about what exactly changed.

I guess the only potential takeaway here is that forcing the package sequence would also work, should the problem pop up again.