lxc / lxc-ci

LXC continuous integration and build scripts
https://jenkins.linuxcontainers.org
Apache License 2.0

Building Ubuntu noble VM from `images/ubuntu.yaml` produces an image that doesn't immediately boot #813

Closed kienanstewart closed 8 months ago

kienanstewart commented 8 months ago

After importing an Ubuntu noble image built using images/ubuntu.yaml, the VM drops into the grub command line.

The following command was used to build the rootfs:

distrobuilder build-incus --vm ubuntu.yaml -o image.architecture=amd64 -o image.variant=cloud -o image.release=noble noble

Launching the imported image then gives:

$ lxc launch --vm -p default -p ci-node -p ci-rootnode --console 120116d86ae2bed806b1fc465d08ca97364568b54fd8d8582af1294b451e166e
Creating the instance
Instance name is: proud-iguana
Starting proud-iguana
To detach from the console, press: <ctrl>+a q
BdsDxe: loading Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)
BdsDxe: starting Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)
error: file `/boot/' not found.
error: no such device: /.disk/info.
error: no such device: /.disk/mini-info.

grub> ls
(memdisk) (hd0) (hd0,gpt2) (hd0,gpt1)

grub> probe -u (hd0,1)
D846-81AF

grub> probe -u (hd0,2)
3ccc7d4b-29d8-487d-a6cf-c7fe4926b0af

grub> cat /efi/boot/grub.cfg
search.fs_uuid 3ccc7d4b-29d8-487d-a6cf-c7fe4926b0af root
set prefix=($root)'/boot/grub'
configfile $prefix/grub.cfg

grub> echo $root
hd0,gpt1

grub> echo $prefix
(hd0,gpt1)/boot/grub

grub> ls (hd0,gpt1)/
efi/

grub> search.fs_uuid 3ccc7d4b-29d8-487d-a6cf-c7fe4926b0af
hd0,gpt2

I can manually boot the system from the grub console. It looks to me as if the initial (hd0,1)/efi/boot/grub.cfg wasn't used, based on the state of $root and $prefix.
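
The UUID comparison done by hand in the grub console can also be scripted outside the VM. This is a minimal sketch, not part of distrobuilder or lxc-ci: `stub_uuid` is a hypothetical helper, the stub file is recreated from the console output above, and the root UUID is the value `probe -u (hd0,2)` reported.

```shell
#!/bin/sh
# Print the UUID that a stub grub.cfg searches for.
# stub_uuid is a hypothetical helper name, not a grub tool.
stub_uuid() {
    awk '$1 == "search.fs_uuid" { print $2; exit }' "$1"
}

# Recreate the stub from the console output above in a temporary file.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
search.fs_uuid 3ccc7d4b-29d8-487d-a6cf-c7fe4926b0af root
set prefix=($root)'/boot/grub'
configfile $prefix/grub.cfg
EOF

# UUID reported by `probe -u (hd0,2)`; on a build host this could instead
# come from `blkid -s UUID -o value <root partition device>`.
root_uuid=3ccc7d4b-29d8-487d-a6cf-c7fe4926b0af

if [ "$(stub_uuid "$cfg")" = "$root_uuid" ]; then
    echo "stub grub.cfg points at the root filesystem"
else
    echo "stub grub.cfg UUID does not match the root filesystem"
fi
rm -f "$cfg"
```

A match here only rules out a wrong UUID in the stub. It says nothing about whether grub actually read the stub, which is the open question in this report.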

Distrobuilder: 44efe1e3b3d65bb3b769695ec177f5fd92483e93
lxd-client: 5.0.2+git20231211.1364ae4-3 (Debian sid)
lxd: 5.0.2-5 (Debian bookworm)

stgraber commented 8 months ago

Yep, that's why we're not currently building noble images on our image server :)

Someone needs to figure out what changed in Ubuntu between mantic and noble and how to fix the YAML for it.

kienanstewart commented 8 months ago

From initial testing, changing grub-install --uefi-secure-boot --target="${TARGET}-efi" --no-nvram --removable to grub-install --no-uefi-secure-boot --target="${TARGET}-efi" --no-nvram --removable lets grub continue booting without user intervention.

When --uefi-secure-boot is given, the grub-mkimage command is invoked as follows (this can be seen by adding the -v flag to grub-install in the post-files action):

grub-install: info: grub-mkimage --directory '/usr/lib/grub/x86_64-efi' --prefix '/boot/grub' --output '/boot/grub/x86_64-efi/core.efi'  --dtb '' --sbat '' --format 'x86_64-efi' --compression 'auto'   --config '/boot/grub/x86_64-efi/load.cfg' 'ext2' 'part_gpt' 'search_fs_uuid' 

When --no-uefi-secure-boot is given, the command is:

grub-install: info: grub-mkimage --directory '/usr/lib/grub/x86_64-efi' --prefix '(,gpt2)/boot/grub' --output '/boot/grub/x86_64-efi/core.efi'  --dtb '' --sbat '' --format 'x86_64-efi' --compression 'auto'   'ext2' 'part_gpt' 
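
The two invocations differ in their --prefix value and in whether a load.cfg is embedded via --config. A small sketch for pulling those two facts out of `grub-install -v` output, so builds can be compared without reading the full command line. The helper names `mkimage_prefix` and `mkimage_has_config` are hypothetical, and the input lines are the ones quoted above.

```shell
#!/bin/sh
# Print the --prefix value of each grub-mkimage line read on stdin.
# mkimage_prefix is a hypothetical helper name, not part of grub.
mkimage_prefix() {
    sed -n "s/.*grub-mkimage .*--prefix '\([^']*\)'.*/\1/p"
}

# Print "yes" if the input passes an embedded config via --config, else "no".
mkimage_has_config() {
    if grep -q -- "--config '"; then echo yes; else echo no; fi
}

# Line logged with --uefi-secure-boot (copied from the output above).
mkimage_prefix <<'EOF'
grub-install: info: grub-mkimage --directory '/usr/lib/grub/x86_64-efi' --prefix '/boot/grub' --output '/boot/grub/x86_64-efi/core.efi'  --dtb '' --sbat '' --format 'x86_64-efi' --compression 'auto'   --config '/boot/grub/x86_64-efi/load.cfg' 'ext2' 'part_gpt' 'search_fs_uuid'
EOF
# prints: /boot/grub

# Line logged with --no-uefi-secure-boot (copied from the output above).
mkimage_prefix <<'EOF'
grub-install: info: grub-mkimage --directory '/usr/lib/grub/x86_64-efi' --prefix '(,gpt2)/boot/grub' --output '/boot/grub/x86_64-efi/core.efi'  --dtb '' --sbat '' --format 'x86_64-efi' --compression 'auto'   'ext2' 'part_gpt'
EOF
# prints: (,gpt2)/boot/grub
```

In a real build the input would come from something like `grub-install -v ... 2>&1 | mkimage_prefix`; whether the info lines go to stdout or stderr is an assumption here, hence the `2>&1`.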

kienanstewart commented 8 months ago

In images/debian.yaml, the post-files action which installs grub runs grub-install twice (https://github.com/lxc/lxc-ci/blob/e5f93e469bdab592d29b10128df79b0a7e5358ad/images/debian.yaml#L1471C1-L1474C72):

    # This will create EFI/BOOT
    grub-install --uefi-secure-boot --target="${TARGET}-efi" --no-nvram --removable
    # This will create EFI/debian
    grub-install --uefi-secure-boot --target="${TARGET}-efi" --no-nvram

Since grub2 in both Ubuntu and Debian carries the patch for installing signed copies, and the Debian sid image still passes, I wondered whether the second invocation makes the difference.

The grub-mkimage command invoked when installing with --removable is:

grub-mkimage --directory '/usr/lib/grub/x86_64-efi' --prefix '/boot/grub' --output '/boot/grub/x86_64-efi/core.efi'  --dtb '' --sbat '' --format 'x86_64-efi' --compression 'auto'   --config '/boot/grub/x86_64-efi/load.cfg' 'ext2' 'part_gpt' 'search_fs_uuid'

And without --removable:

grub-mkimage --directory '/usr/lib/grub/x86_64-efi' --prefix '' --output '/boot/grub/x86_64-efi/grub.efi'  --dtb '' --sbat '' --format 'x86_64-efi' --compression 'auto'   --config '/boot/grub/x86_64-efi/load.cfg' 'ext2' 'part_gpt' 'search_fs_uuid' 

Including both grub-install statements results in an image that passes through grub without user intervention.

I also tested using only grub-install --uefi-secure-boot --target="${TARGET}-efi" --no-nvram in the post-files action: this worked and the system booted without user interaction.

kienanstewart commented 8 months ago

@stgraber I see you committed a change to sync ubuntu.yaml and debian.yaml. I think it might be worth trying to figure out whether a single grub-install invocation works across multiple releases. It could be that the original reasons for running it twice no longer apply.

stgraber commented 8 months ago

Thanks a lot for the detective work. I just pushed a couple of fixes to align Debian and Ubuntu.

It certainly sounds like calling without --removable would be sufficient, though I think it's still safer to perform both calls: in theory, --removable should install the bootx64.efi and the normal call should install grubx64.efi.

It isn't quite as simple as that, since the shim also needs to be installed and is actually what ends up being bootx64.efi. But I can't be sure that both always get installed in all the right places on all versions of Ubuntu and Debian that folks build images for, and calling the install twice doesn't really cost much, so let's just do that ;)
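
One way to check that both calls left their files in place would be to inspect the ESP after the post-files action. A minimal sketch under stated assumptions: `check_efi_files` is a hypothetical helper, the file names are the amd64 ones mentioned above (bootx64.efi, grubx64.efi), the distro directory is passed in by the caller, and a scratch directory stands in for the real mounted ESP.

```shell
#!/bin/sh
# Report whether the removable and distro-specific loaders exist under an ESP.
# Usage: check_efi_files <esp mount point> <distro directory name>
# Returns 0 if both files exist, 1 otherwise. Hypothetical helper; amd64 names.
check_efi_files() {
    esp=$1
    distro=$2
    missing=0
    for f in "EFI/BOOT/bootx64.efi" "EFI/$distro/grubx64.efi"; do
        if [ -e "$esp/$f" ]; then
            echo "ok      $f"
        else
            echo "missing $f"
            missing=1
        fi
    done
    return "$missing"
}

# Demo on a scratch directory that only has the --removable result.
esp=$(mktemp -d)
mkdir -p "$esp/EFI/BOOT"
: > "$esp/EFI/BOOT/bootx64.efi"
check_efi_files "$esp" ubuntu || echo "ESP is incomplete"
rm -rf "$esp"
```

On a real FAT-formatted ESP the lookup is case-insensitive, so the upper-case names firmware expects would also match; the scratch directory in the demo is case-sensitive and uses the lower-case names as written in this thread.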

kienanstewart commented 8 months ago

Sounds good to me. Thanks for committing a fix :)