churchers / vm-bhyve

Shell based, minimal dependency bhyve manager
BSD 2-Clause "Simplified" License

SLES VM does not boot after OS install using UEFI loader #414

Open sronsiek opened 3 years ago

sronsiek commented 3 years ago

Hi,

I've been trying to install & run SLES15 on bhyve with mixed success. I'm using a template based on a working openleap configuration; the install seems to work fine and I end up with a running system. However, following a vm stop the system no longer boots and instead drops into a UEFI interactive shell (which I don't know what to do with). The procedure, and as much detail as I can muster, are provided below. I've not found anything obvious in the wiki that I might be doing wrong.

Is there possibly a bug here - or is it a config issue?

# uname -a
FreeBSD boogaloo 12.0-RELEASE-p3 FreeBSD 12.0-RELEASE-p3 GENERIC  amd64

Relevant Package versions:

grub2-bhyve-0.40_8
uefi-edk2-bhyve-0.2_1,1
vm-bhyve-1.4.2

Template content:

cat /vm/.templates/SLE-15-SP2-Server.conf

loader="uefi"
cpu=2
memory=4G
network0_type="virtio-net"
network0_switch="public"
disk0_type="virtio-blk"
disk0_name="disk0"
disk0_size=40G
disk0_dev="zvol"
graphics="yes"
graphics_res="1280x1024"
xhci_mouse="yes"
virt_random="yes"

# disk1_type="ahci-cd"
# disk1_dev="custom"
# disk1_name="/vm/.iso/SLE-15-SP2-Full-x86_64-GM-Media1.iso"

VM creation:

vm create -t SLE-15-SP2-Server tp-sles15-sp2

# Config edit: uncomment 3 lines mounting iso in /vm/tp-sles15-sp2/tp-sles15-sp2.conf
# then:

vm install tp-sles15-sp2 refind-cd-0.11.4.iso
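
The config edit above (uncommenting the three disk1 lines, and later commenting them out again) can be scripted. The following is a minimal sketch only: the function names enable_disk1 and disable_disk1 are hypothetical helpers, not part of vm-bhyve, and they assume the lines look exactly like the "# disk1_..." entries in the template.

```shell
#!/bin/sh
# Hypothetical helpers (not part of vm-bhyve) that toggle the disk1_* lines
# in a vm-bhyve config file. The config path is passed as the first argument.

# Uncomment every "# disk1_..." line.
enable_disk1() {
    conf="$1"
    tmp="${conf}.tmp.$$"
    # Strip a leading "#" and any following whitespace, only on disk1_* lines.
    sed 's/^#[[:space:]]*\(disk1_\)/\1/' "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"   # write back in place, keeping file permissions
    rm -f "$tmp"
}

# Comment out every active disk1_* line again.
disable_disk1() {
    conf="$1"
    tmp="${conf}.tmp.$$"
    sed 's/^\(disk1_\)/# \1/' "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Example (path taken from the post above):
#   enable_disk1 /vm/tp-sles15-sp2/tp-sles15-sp2.conf
```

A temporary file is used instead of sed -i because the in-place flag takes different arguments on FreeBSD and GNU sed.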

At this point a VNC client is opened and the OS is installed. The procedure completes and we have a running system. It is possible to reboot the VM by issuing a reboot command within the VM.

Disk allocation on host:

zfs list | grep tp-sles15-sp2
zroot/vm/tp-sles15-sp2                    42.4G  1.68T   400K /vm/tp-sles15-sp2
zroot/vm/tp-sles15-sp2/disk0              42.4G  1.72T  1.23G  -

Disk config within the VM:

localhost:~ # fdisk -sl
Disk /dev/vda: 40 GiB, 42949672960 bytes, 83886080 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: gpt
Disk identifier: DA84AAC5-E83D-463D-8B40-3B9743D5EB7B

Device        Start      End  Sectors  Size Type
/dev/vda1      2048  1026047  1024000  500M EFI System
/dev/vda2   1026048 50112511 49086464 23.4G Linux filesystem
/dev/vda3  50112512 75843583 25731072 12.3G Linux filesystem
/dev/vda4  75843584 83886046  8042463  3.9G Linux swap
localhost:~ # lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0     11:0    1  9.9G  0 rom
vda    254:0    0   40G  0 disk
├─vda1 254:1    0  500M  0 part /boot/efi
├─vda2 254:2    0 23.4G  0 part /
├─vda3 254:3    0 12.3G  0 part /home
└─vda4 254:4    0  3.9G  0 part [SWAP]

Following a vm stop at this point, a subsequent vm start no longer starts the VM. The behaviour seen is:

  1. With unchanged configuration, the following msg is seen in VNC / console:
     Boot Failed. EFI DVD/CDROM
     Failed to set MokListRT: Invalid Parameter
     Something has gone seriously wrong: import_mok_state() failed
     : Invalid Parameter

vm-bhyve.log:

Apr 16 14:25:16: initialising
Apr 16 14:25:16:  [loader: uefi]
Apr 16 14:25:16:  [cpu: 1]
Apr 16 14:25:16:  [memory: 4G]
Apr 16 14:25:16:  [hostbridge: standard]
Apr 16 14:25:16:  [com ports: com1]
Apr 16 14:25:16:  [uuid: 6a675349-5f3b-11eb-8e9c-bcee7b5d5740]
Apr 16 14:25:16:  [utctime: yes]
Apr 16 14:25:16:  [debug mode: no]
Apr 16 14:25:16:  [primary disk: disk0]
Apr 16 14:25:16:  [primary disk dev: zvol]
Apr 16 14:25:16: initialising network device tap0
Apr 16 14:25:16: adding tap0 -> vm-public (public addm)
Apr 16 14:25:16: bring up tap0 -> vm-public (public addm)
Apr 16 14:25:16: dynamically allocated port 5902 for vnc connections
Apr 16 14:25:16: booting
Apr 16 14:25:16:  [bhyve options: -c 1 -m 4G -Hwl bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -U 6a675349-5f3b-11eb-8e9c-bcee7b5d5740 -u]
Apr 16 14:25:16:  [bhyve devices: -s 0,hostbridge -s 31,lpc -s 4:0,virtio-blk,/dev/zvol/zroot/vm/tp-sles15-sp2/disk0 -s 5:0,ahci-cd,/vm/.iso/SLE-15-SP2-Full-x86_64-GM-Media1.iso -s 6:0,virtio-net,tap0,mac=58:9c:fc:01:c2:e9 -s 7:0,virtio-rnd -s 8:0,fbuf,tcp=0.0.0.0:5902,w=1280,h=1024 -s 9:0,xhci,tablet]
Apr 16 14:25:16:  [bhyve console: -l com1,/dev/nmdm-tp-sles15-sp2.1A]
Apr 16 14:25:16:  [bhyve iso device: -s 3:0,ahci-cd,/vm/.config/null.iso]
Apr 16 14:25:16: starting bhyve (run 1)

  2. Although the iso mount is needed for subsequent repo availability, I then commented out the 3 disk1 lines in the config just to work around the previous error. The following behaviour is then seen:
   Boot Failed. EFI DVD/CDROM
   Boot Failed. EFI Misc Device
   .

Then, after a timeout of ~60 seconds:

   UEFI Interactive Shell v2.1
   EDK II
   UEFI v2.40 (BHYVE, 0x00010000)
   Mapping table
      FS0: Alias(s):HD9b:;BLK2:
          PciRoot(0x0)/Pci(0x4,0x0)/HD(1,GPT,696EA160-87CE-401E-A0C4-E5A83242A57D,0x800,0xFA000)
     BLK0: Alias(s):
          PciRoot(0x0)/Pci(0x3,0x0)/Sata(0x0,0x0,0x0)
     BLK1: Alias(s):
          PciRoot(0x0)/Pci(0x4,0x0)
     BLK3: Alias(s):
          PciRoot(0x0)/Pci(0x4,0x0)/HD(2,GPT,C73502B7-5896-4B2C-9540-C476C4C63812,0xFA800,0x2ED0000)
     BLK4: Alias(s):
          PciRoot(0x0)/Pci(0x4,0x0)/HD(3,GPT,3270A07A-ADBD-4E81-AEB8-DA70A1F256DD,0x2FCA800,0x188A000)
     BLK5: Alias(s):
          PciRoot(0x0)/Pci(0x4,0x0)/HD(4,GPT,FE6CB526-A84F-44AF-BF42-02AE58D90560,0x4854800,0x7AB7DF)

     Press ESC in 5 seconds to skip startup.nsh or any other key to continue.
     Shell>

vm-bhyve.log:

Apr 16 14:26:15: initialising
Apr 16 14:26:15:  [loader: uefi]
Apr 16 14:26:15:  [cpu: 1]
Apr 16 14:26:15:  [memory: 4G]
Apr 16 14:26:15:  [hostbridge: standard]
Apr 16 14:26:15:  [com ports: com1]
Apr 16 14:26:15:  [uuid: 6a675349-5f3b-11eb-8e9c-bcee7b5d5740]
Apr 16 14:26:15:  [utctime: yes]
Apr 16 14:26:15:  [debug mode: no]
Apr 16 14:26:15:  [primary disk: disk0]
Apr 16 14:26:15:  [primary disk dev: zvol]
Apr 16 14:26:15: initialising network device tap0
Apr 16 14:26:15: adding tap0 -> vm-public (public addm)
Apr 16 14:26:15: bring up tap0 -> vm-public (public addm)
Apr 16 14:26:15: dynamically allocated port 5902 for vnc connections
Apr 16 14:26:15: booting
Apr 16 14:26:15:  [bhyve options: -c 1 -m 4G -Hwl bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -U 6a675349-5f3b-11eb-8e9c-bcee7b5d5740 -u]
Apr 16 14:26:15:  [bhyve devices: -s 0,hostbridge -s 31,lpc -s 4:0,virtio-blk,/dev/zvol/zroot/vm/tp-sles15-sp2/disk0 -s 5:0,virtio-net,tap0,mac=58:9c:fc:01:c2:e9 -s 6:0,virtio-rnd -s 7:0,fbuf,tcp=0.0.0.0:5902,w=1280,h=1024 -s 8:0,xhci,tablet]
Apr 16 14:26:15:  [bhyve console: -l com1,/dev/nmdm-tp-sles15-sp2.1A]
Apr 16 14:26:15:  [bhyve iso device: -s 3:0,ahci-cd,/vm/.config/null.iso]
Apr 16 14:26:15: starting bhyve (run 1)

churchers commented 3 years ago

Chances are this is the common problem of the Linux distro installing its EFI loader at a custom path, then using EFI variables to point at it. These variables are lost when bhyve exits, so the UEFI firmware just looks for the default /EFI/BOOT/BOOTX64.EFI instead.
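
To check whether a guest is affected, you can look for the default loader on its EFI system partition. This is a rough sketch, assuming the ESP is mounted somewhere (for example at /boot/efi inside the guest); has_fallback_loader is a made-up helper name, and it relies on find supporting -ipath, which both the GNU and FreeBSD versions do.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the ESP mounted at $1 contains the default
# loader path EFI/BOOT/BOOTX64.EFI, fail otherwise.
has_fallback_loader() {
    esp="$1"
    # FAT is case-insensitive, but the mount may show names in any case,
    # so match the path without regard to case.
    found=$(find "$esp" -type f -ipath '*/efi/boot/bootx64.efi' 2>/dev/null | head -n 1)
    [ -n "$found" ]
}

# Example:
#   has_fallback_loader /boot/efi || echo "no default loader on the ESP"
```

If the function fails, the firmware has nothing to fall back to once the EFI variables are gone.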

We've been waiting a long time to get support for saving EFI variable state.

sronsiek commented 3 years ago

Not sure who you're waiting on ... but:

Is there a workaround for this?

Is it perhaps possible to configure the path and filename, as with the grub loader?

handcode commented 3 years ago

If the .efi filename is your problem, try this as a workaround: https://github.com/churchers/vm-bhyve/issues/336#issuecomment-565768120

churchers commented 3 years ago

The bhyve hypervisor devs have apparently been working for a while to implement persistent storage for the UEFI firmware. A real system would have some non-volatile storage so that UEFI settings can be set once and are still there after a poweroff. bhyve basically does the equivalent of a "reset bios to defaults" on every boot. The work seems to have stagnated though. I think it's complicated by the fact that the firmware is quite far behind upstream, so it has become part of a much larger job to get the entire firmware updated.

As far as I'm aware, it should be possible to work around this by mounting the EFI partition and copying the linux/grub bootloader to /EFI/BOOT/BOOTX64.EFI, which is where UEFI will look by default if there's no EFI variable to tell it to look somewhere else. I think this can be done from the UEFI shell, although I can't really see why it couldn't be done by mounting the msdos EFI partition directly on the host and doing it in FreeBSD. I've never tried this though.

sronsiek commented 3 years ago

That's excellent feedback, thank you. The following fixes UEFI boot for case 2 (no iso mounted):

cd /boot/efi/EFI
mkdir boot
cp -p sles/grubx64.efi boot/bootx64.efi
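
The same three commands can be wrapped in a small function so they are easy to repeat. This is a sketch only: install_fallback_loader is a hypothetical name, and the vendor directory (sles) and loader file (grubx64.efi) are just the values from this install; other distros will differ.

```shell
#!/bin/sh
# Hypothetical helper: copy a distro's EFI loader to the default path
# EFI/boot/bootx64.efi on an already-mounted EFI system partition.
#   $1 = ESP mount point (e.g. /boot/efi)
#   $2 = vendor directory under EFI (e.g. sles)
#   $3 = loader file name (optional, defaults to grubx64.efi)
install_fallback_loader() {
    esp="$1"
    vendor="$2"
    loader="${3:-grubx64.efi}"
    src="$esp/EFI/$vendor/$loader"
    dst_dir="$esp/EFI/boot"

    if [ ! -f "$src" ]; then
        echo "loader not found: $src" >&2
        return 1
    fi

    # mkdir -p is a no-op if the directory already exists, so re-running is safe.
    mkdir -p "$dst_dir" && cp -p "$src" "$dst_dir/bootx64.efi"
}

# Example, matching the commands above:
#   install_fallback_loader /boot/efi sles
```

It is probably worth re-running after any update that replaces the distro's grubx64.efi, since the copy will not follow it automatically.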

My final problem then is the inability to boot when I re-introduce a DVD/CD drive:

     Boot Failed. EFI DVD/CDROM
     Failed to set MokListRT: Invalid Parameter
     Something has gone seriously wrong: import_mok_state() failed
     : Invalid Parameter

Inspection of the boot order shows that the DVD is tried first. Changing this is not possible, as the boot order is stored in non-volatile storage on real systems ... SELinux is disabled. Is a workaround known for this (perhaps removing CD/DVD from the boot options)? I've seen posts saying to cp bootx64.efi to shimx64.efi, but this does not work.