lf-edge / eve

EVE is Edge Virtualization Engine
https://www.lfedge.org/projects/eve/
Apache License 2.0

arm64 installer issue: files are missing at the EFI partition of the destination storage device #4288

Closed rene closed 1 month ago

rene commented 1 month ago

Description

When installing EVE on an ARM device, typically the following files are expected to be on the EFI partition:

bcm2711-rpi-4-b.dtb
bcm2711-rpi-cm4.dtb
boot
config.txt
EFI
fixup4.dat
fr201.txt
overlays
start4.elf
startup.nsh
u-boot.bin
ubootefi.var
UsbInvocationScript.txt

However, only the following files are available after the installation:

cmdline
dtb
EFI
ubootefi.var

This issue causes RPi-based devices to fail to boot after installation.

How to reproduce

  1. Generate an installer-raw image
  2. Flash it to a USB stick
  3. Try to install on an RPi-based device
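
To confirm the result after step 3, the target's EFI partition can be compared against the expected file list from the description. This is only a rough sketch: the function name missing_esp_files is made up for illustration, and the mount point you pass in is whatever you mounted the target ESP on.

```shell
#!/bin/sh
# Sketch: print every expected ESP entry that is absent under the given directory.
# The expected names are the ones listed in the description above.
# missing_esp_files is a hypothetical helper, not part of EVE.
missing_esp_files() {
  esp="$1"
  for f in bcm2711-rpi-4-b.dtb bcm2711-rpi-cm4.dtb boot config.txt EFI \
           fixup4.dat fr201.txt overlays start4.elf startup.nsh u-boot.bin \
           ubootefi.var UsbInvocationScript.txt; do
    # -e matches both files and directories (boot, EFI, overlays)
    [ -e "$esp/$f" ] || echo "$f"
  done
}
```

Calling it as missing_esp_files /mnt (with the ESP mounted on /mnt) prints one missing name per line, and prints nothing when all entries are present.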

Affected versions

Additional context

Thanks to @dpoulios for first observing this issue on an RPi-based device.

deitch commented 1 month ago

The EFI System Partition is created and populated by make-raw, which is part of pkg/mkimage-raw-efi. It is invoked with a list of partitions to create, e.g. make-raw disk.img efi installer imga etc.

For each partition, there is a function do_<part>, e.g. do_efi or do_imga. That function is responsible for creating the partition and populating it.

In the case of building the installer image, make-raw is called with efi installer (and some others). There is a check that says, "hey, we are building an EFI for something that will be an installer, I will add some extra stuff":

if echo "$PARTS" | grep -q "efi" && echo "$PARTS" | grep -q "installer"; then
  PARTS=$(echo "$PARTS" | sed -e 's/efi/efiinstaller/')
fi

Instead of calling do_efi and do_installer, it calls do_efiinstaller and do_installer. We will ignore the do_installer for now, as that works fine.
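
To make the flow concrete, here is a minimal sketch of the rewrite-then-dispatch pattern described above. The if block is copied from the snippet above. Everything else (rewrite_parts, build_parts, and the placeholder bodies of the do_* functions) is an assumption made up for illustration, not the real make-raw code.

```shell
#!/bin/sh
# Sketch only: the do_* bodies are placeholders, not the real make-raw logic.

# Rewrite "efi" to "efiinstaller" when both efi and installer are requested.
rewrite_parts() {
  parts="$1"
  if echo "$parts" | grep -q "efi" && echo "$parts" | grep -q "installer"; then
    parts=$(echo "$parts" | sed -e 's/efi/efiinstaller/')
  fi
  echo "$parts"
}

do_efi()          { echo "efi: create ESP, copy /parts/boot"; }
do_efiinstaller() { do_efi; echo "efiinstaller: add installer extras"; }
do_installer()    { echo "installer: create installer partition"; }
do_imga()         { echo "imga: create IMGA partition"; }

# Call do_<part> for each requested partition, in order.
build_parts() {
  for p in $(rewrite_parts "$1"); do
    "do_$p"
  done
}

build_parts "efi installer imga"
```

With efi installer imga as input, the list becomes efiinstaller installer imga, so do_efi only runs indirectly, from inside do_efiinstaller.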

do_efi does the right thing and builds the EFI System Partition, populating it with the contents of /parts/boot (will explain that in a moment).

do_efiinstaller calls do_efi, but also copies some installer stuff into the EFI partition:

The contents of /parts/boot are taken from whatever is passed to make-efi script, or mounted into the container:

All of this is to set the background. I see two issues:

  1. The do_efiinstaller is mixing in both installer things that should only be there when building an installer, and things that should be there for all images that meet a certain requirement. These need to be split.
  2. The remaining pieces likely are from an error in the linked lines in install. Either it is not finding the right things, or they are not there. Perhaps even the installer should take them from its own ESP?

deitch commented 1 month ago

More context. I built an arm64 installer image, as suggested by @rene. It has everything in the right place in the ESP.

If I look at the installer.tar, here are the contents of EFI/:

drwxr-xr-x  0 0      0           0 Sep 19 14:27 EFI
drwxr-xr-x  0 0      0           0 Sep 19 14:36 EFI/BOOT
-rw-rw-r--  0 1000   1001    21172 Sep 19 14:23 EFI/BOOT/grub.cfg
-rw-r--r--  0 0      0     1060864 Sep 19 14:36 EFI/BOOT/BOOTAA64.EFI

None of the other stuff is there. That means it is make-raw that is installing it, and that it is not in the arm64 installer.tar.

I next ran makeflash.sh, the last stage of making the actual installer.raw, with shell debugging. Notably, it mounts $(BUILD_DIR)/installer/:/parts/. Here are the contents of $(BUILD_DIR)/installer/:

-rw-r--r-- 1 ubuntu ubuntu   37048 Sep 10 07:18 bcm2711-rpi-4-b.dtb
-rwxr-xr-x 1 ubuntu ubuntu    1666 Sep 10 07:17 config.txt
-rw------- 1 ubuntu ubuntu    5411 Sep 10 07:17 fixup4.dat
drwxr-xr-x 2 ubuntu ubuntu    4096 Sep 10 07:17 overlays
-rw------- 1 ubuntu ubuntu 2241440 Sep 10 07:17 start4.elf
-rwxr-xr-x 1 ubuntu ubuntu       9 Sep 10 07:17 startup.nsh
-rwxr-xr-x 1 ubuntu ubuntu  707776 Sep 10 07:18 u-boot.bin

That looks like most of the stuff you were looking for. It was created as part of the make installer-raw target, specifically $(BOOT_PART), which runs in the Makefile:

    $(LINUXKIT) pkg build --pull --platforms linux/$(ZARCH) pkg/$(PKG) # running linuxkit pkg build _without_ force ensures that we either pull it down or build it.
    cd $(dir $@) && $(LINUXKIT) cache export --arch $(DOCKER_ARCH_TAG) --format filesystem --outfile - $(shell $(LINUXKIT) pkg show-tag pkg/$(PKG)) | tar xvf - $(notdir $@)

It populates all of that in $(BUILD_DIR)/installer/boot.

Here are the important lines in the debug output of makeflash.sh:

+ cp -r /parts/boot/bcm2711-rpi-4-b.dtb /parts/boot/config.txt /parts/boot/fixup4.dat /parts/boot/overlays /parts/boot/start4.elf /parts/boot/startup.nsh /parts/boot/u-boot.bin /efifs/

That actually is not even part of the do_efiinstaller, just part of the regular do_efi.

Conclusions:

  1. The desired contents are not part of the installer.tar.
  2. The desired contents were not placed by the "installer-special" do_efiinstaller in make-raw, but were found in the $(BUILD_DIR)/installer/boot directory, which make-raw placed in ESP automatically.
  3. When running the install, those parts then are not available in the installer partition, since they were not in installer.tar but rather placed later in ESP.
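
The gap in these conclusions can be checked mechanically. As a rough sketch, the helper below (not_in_tar, a name made up here) lists entries of a boot directory that have no same-named top-level entry in a tar archive. Both paths are whatever your build produces. It uses only tar tf, sed, sort, grep and basename.

```shell
#!/bin/sh
# Sketch: print boot-directory entries that have no same-named top-level entry in a tar.
# not_in_tar is a hypothetical helper; pass your own installer.tar and boot directory.
not_in_tar() {
  tarfile="$1"
  bootdir="$2"
  # Normalize archive names: drop a leading "./" and keep only the first path component.
  top=$(tar tf "$tarfile" | sed -e 's|^\./||' -e 's|/.*$||' | sort -u)
  for f in "$bootdir"/*; do
    name=$(basename "$f")
    echo "$top" | grep -qxF -- "$name" || echo "$name"
  done
}
```

Run against the arm64 installer.tar and the boot directory, any name it prints is one that exists in the build directory but would not be available from the tar at install time.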

I see a few possible solutions:

  1. Include the necessary files in the installer.tar. This would make the Makefile a tad cleaner and easier, and ensure the files are there both for building the installer image and when running the installer image to install on a target disk.
  2. Leave it as is, but change the install script to look for the files from the ESP. I don't like this as much, as an installer ESP might or might not look like the final target ESP.
  3. Place the files inside an .img, just like we do for rootfs.img and config.img, etc., place them on the root of the installer image and mount into the installer. This seems repetitive and confusing, as we actually do need the files in the ESP of the installer, unlike those, which are explicitly bits to be copied as-is to target partitions.

I think that if we can do the first, that is the best option.

Thoughts?

deitch commented 1 month ago

@rene I see that the files are in 2 places:

Everything in /boot/ is also in / of the ESP. Does it also need to be in /boot/? I don't think it affects anything there. The comments on the original code there said it was put there for later use by the installer. Which, we know, is the issue we are trying to resolve, but separately, do these files need to be in ESP:/boot/?

$ ls /mnt
EFI  UsbInvocationScript.txt  bcm2711-rpi-4-b.dtb  boot  config.txt  fixup4.dat  overlays  start4.elf  startup.nsh  u-boot.bin
$ ls /mnt/boot
bcm2711-rpi-4-b.dtb  config.txt  fixup4.dat  overlays  start4.elf  startup.nsh  u-boot.bin
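
If it helps to answer that, here is a quick sketch for checking whether the copies under boot/ are byte-identical to the ones in the ESP root. compare_boot_copies is a made-up helper name. It only looks at regular files, so directories such as overlays are skipped.

```shell
#!/bin/sh
# Sketch: for each regular file in <esp>/boot, report whether the ESP root
# holds an identical copy. compare_boot_copies is a hypothetical helper.
compare_boot_copies() {
  esp="$1"
  for f in "$esp"/boot/*; do
    [ -f "$f" ] || continue   # skip directories such as overlays
    name=$(basename "$f")
    # cmp -s is silent and returns non-zero on a difference or a missing file
    if cmp -s "$f" "$esp/$name"; then
      echo "same: $name"
    else
      echo "differs or missing: $name"
    fi
  done
}
```

For the listing above, that would be compare_boot_copies /mnt.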

deitch commented 1 month ago

> I think that if we can do the first, that is the best option.

That ended up being too hard. It required retooling of make-raw, and all of its dependencies. It is not a bad idea, but too big for this context. I will open a PR to make those bits available in the right place in the installer image, so that the install script and make-raw have access to them.

We should do another phase of installer restructuring to simplify it, following the previous big one that changed how it is built. Next.