Closed ltrager closed 3 years ago
Does starting a normal LXD Ubuntu VM outside of MAAS with secureboot enabled works for you?
If so, you should look at edibootmgr and the files in /boot/efi to see what differs.
I just validated that a hirsute system with:
root@v-latest:~# dpkg -l | grep shim-signed
ii shim-signed 1.47+15.4-0ubuntu2 amd64 Secure Boot chain-loading bootloader (Microsoft-signed binary)
root@v-latest:~# dpkg -l | grep grub-efi
ii grub-efi-amd64 2.04-1ubuntu45 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version)
ii grub-efi-amd64-bin 2.04-1ubuntu45 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 modules)
ii grub-efi-amd64-signed 1.169+2.04-1ubuntu45 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 version, signed)
Boots without any issue here.
Trying to use those binaries through MAAS here, I always get to a grub prompt, manually running configfile /grub/grub.cfg
gets it going, not sure why it's not doing that automatically in the first place...
Post install, I'm seeing a hang at:
Booting local disk...
Failed to open \efi\boot\grubx64.efi - Not Found
Failed to load image \efi\boot\grubx64.efi: Not Found
start_image() returned Not Found
EFI stub: UEFI Secure Boot is enabled.
Then looking at qemu's log, we're seeing a qemu/kvm crash:
KVM internal error. Suberror: 1
emulation failure
RAX=000000007ffa0080 RBX=000000007ff82b10 RCX=000000007e900e18 RDX=00000000000088bd
RSI=000000003da5c94d RDI=000000003ffff1c4 RBP=8000000000000001 RSP=000000007ff829f8
R8 =0000000000000028 R9 =000000007f8ee9e7 R10=000000007ff64000 R11=0000000000000000
R12=0000000000000000 R13=000000007e900e18 R14=000000003da56a2c R15=000000007ff82b00
RIP=00000000000affff RFL=00010a07 [-O---PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
FS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT= 000000007f9ee698 00000047
IDT= 000000007f3da018 00000fff
CR0=80010033 CR2=0000000000000000 CR3=000000007fc01000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d00
Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Manually booting the VM works fine (ESC to enter firmware, use boot menu and select the ubuntu entry). So this suggests that something in grub's networked codepath is tickling firmware/qemu the wrong way causing a full crash of the emulator.
@ltrager were you testing with the latest edk2? LXD uses 2021.02 which roughly matches what's in impish. You running an older edk2 may explain a difference in behavior.
The concern about -no-reboot
came about in LP:1865515. My understanding is it may be causing the chain of trust to be reset. Dimitri can explain how -no-reboot
may be causing this better than I can. We have been able to verify that the latest SHIM and GRUB do properly support chainloading with the chain of trust preserved on physical hardware as well as libvirt.
To answer your question my libvirt testing was done on Ubuntu 20.10 using packages from main and updates.
ovmf - 2020.05-5ubuntu0.2 qemu - 1:5.0-5ubuntu9.8 libvirt - 6.6.0-1ubuntu3.5
no-reboot
causes qemu to exit when a full system reboot is triggered, that's it. A UEFI fallback to the next boot entry does not trigger it, otherwise we'd never get things booting :)
Given Ubuntu 21.04 as deployed by MAAS boots just fine when you ask the firmware to directly boot from the ubuntu
efi boot entry, the actual secureboot setup looks fine. Though the fact that grub always dumps me to a shell requiring me to type in configfile /grub/grub.cfg
to get things going suggests there may be some weirdness in MAAS or grub there.
It'd be good for you to test with ovmf 2021.02 so we're dealing with a closer comparison. You also should be testing with virtio-net and not e1000e so that the firmware deals with the same network card in both cases.
As suggested I installed Ubuntu 21.10 Impish with ovmf 2021.02-1, qemu 1:5.2+dfsg-9ubuntu3, and libvirt 7.0.0-2ubuntu2 and created a VM using virtio net and virtio storage. MAAS is able to deploy Ubuntu with SecureBoot enabled. There are no problems chainloading while keeping the chain of trust.
Is there an easy way to test Dimitri's theory that -no-reboot
is causing the issue?
Sure if that makes you feel better...
root@athos:~# lxc start maas-vm01
root@athos:~# ps aux | grep maas-vm01
root 1752609 0.0 0.0 79892 3396 ? Ssl 20:21 0:00 /snap/lxd/20383/bin/virtiofsd --socket-path=/var/snap/lxd/common/lxd/logs/maas-vm01/virtio-fs.config.sock -o source=/var/snap/lxd/common/lxd/virtual-machines/maas-vm01/config
lxd 1752898 86.0 0.0 9327240 94736 ? Sl 20:21 0:01 /snap/lxd/20383/bin/qemu-system-x86_64 -S -name maas-vm01 -uuid 8efadf6a-be79-460a-9b58-7090da696005 -daemonize -cpu host -nographic -serial chardev:console -nodefaults -no-user-config -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=deny,resourcecontrol=deny -readconfig /var/snap/lxd/common/lxd/logs/maas-vm01/qemu.conf -spice unix=on,disable-ticketing=on,addr=/var/snap/lxd/common/lxd/logs/maas-vm01/qemu.spice -pidfile /var/snap/lxd/common/lxd/logs/maas-vm01/qemu.pid -D /var/snap/lxd/common/lxd/logs/maas-vm01/qemu.log -chroot /var/snap/lxd/common/lxd/virtual-machines/maas-vm01 -smbios type=2,manufacturer=Canonical Ltd.,product=LXD -runas lxd
root 1752958 0.0 0.0 79892 2036 ? Sl 20:21 0:00 /snap/lxd/20383/bin/virtiofsd --socket-path=/var/snap/lxd/common/lxd/logs/maas-vm01/virtio-fs.config.sock -o source=/var/snap/lxd/common/lxd/virtual-machines/maas-vm01/config
root 1753064 0.0 0.0 9040 2472 pts/0 S+ 20:21 0:00 grep --color=auto maas-vm01
root@athos:~# lxc console maas-vm01
To detach from the console, press: <ctrl>+a q
>>Start HTTP Boot over IPv4....
Station IP address is 172.17.16.80
URI: http://172.17.16.10:5248/images/bootx64.efi
File Size: 955072 Bytes
Downloading...100%BdsDxe: loading Boot0005 "UEFI HTTPv4 (MAC:00163ED0BF0C)" from PciRoot(0x0)/Pci(0x1,0x4)/Pci(0x0,0x0)/MAC(00163ED0BF0C,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)/Uri()
BdsDxe: starting Boot0005 "UEFI HTTPv4 (MAC:00163ED0BF0C)" from PciRoot(0x0)/Pci(0x1,0x4)/Pci(0x0,0x0)/MAC(00163ED0BF0C,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)/Uri()
error: file `/EFI/ubuntu/x86_64-efi/command.lst' not found.
error: file `/EFI/ubuntu/x86_64-efi/fs.lst' not found.
error: file `/EFI/ubuntu/x86_64-efi/crypto.lst' not found.
error: file `/EFI/ubuntu/x86_64-efi/terminal.lst' not found.
error: file `/EFI/ubuntu/grub.cfg' not found.
GNU GRUB version 2.04
Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists possible
device or file completions.
grub>
Booting local disk...
Failed to open \efi\boot\grubx64.efi - Not Found
Failed to load image \efi\boot\grubx64.efi: Not Found
start_image() returned Not Found
EFI stub: UEFI Secure Boot is enabled.
root@athos:~# cat /var/snap/lxd/common/lxd/logs/maas-vm01/qemu.log
KVM internal error. Suberror: 1
emulation failure
RAX=000000007ffa0080 RBX=000000007ff82b40 RCX=000000007e8f4f98 RDX=00000000000054fa
RSI=000000003da5c94d RDI=000000003ffff1c4 RBP=8000000000000001 RSP=000000007ff82a28
R8 =0000000000000028 R9 =000000007f8ee9e7 R10=000000007ff64000 R11=0000000000000000
R12=0000000000000000 R13=000000007e8f4f98 R14=000000003da56a2c R15=000000007ff82b30
RIP=00000000000affff RFL=00010a07 [-O---PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
FS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT= 000000007f9ee698 00000047
IDT= 000000007f3da018 00000fff
CR0=80010033 CR2=0000000000000000 CR3=000000007fc01000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d00
Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
root@athos:~#
So this obviously has absolutely nothing to do with no-reboot.
Using good old print debugging, I've narrowed this down to the kernel calling exit_boot_services()
here.
Here's the OVMF debug output leading up to it the crash:
[Bds]Booting UEFI Misc Device
BlockSize : 512
LastBlock : FFFFFF
Valid efi partition table header
Valid efi partition table header
Valid primary and Valid backup partition table
Partition entries read block success
Number of partition entries: 128
start check partition entries
End check partition entries
Index : 0
Start LBA : 37800
End LBA : 465FDE
Partition size: 42E7DF
Start : 6F00000 End : 8CBFBC00
Index : 13
Start LBA : 800
End LBA : 27FF
Partition size: 2000
Start : 100000 End : 4FFE00
Index : 14
Start LBA : 2800
End LBA : 377FF
Partition size: 35000
Start : 500000 End : 6EFFE00
Prepare to Free Pool
BlockSize : 512
LastBlock : 42E7DE
BlockSize : 512
LastBlock : 1FFF
FSOpen: Open '\EFI\BOOT\BOOTX64.EFI' Success
[Bds] Expand PciRoot(0x0)/Pci(0x1,0x2)/Pci(0x0,0x0) -> PciRoot(0x0)/Pci(0x1,0x2)/Pci(0x0,0x0)/HD(15,GPT,B892B9DA-959A-4C13-9B9C-EDC1C192A820,0x2800,0x35000)/\EFI\BOOT\BOOTX64.EFI
[Security] 3rd party image[0] can be loaded after EndOfDxe: PciRoot(0x0)/Pci(0x1,0x2)/Pci(0x0,0x0)/HD(15,GPT,B892B9DA-959A-4C13-9B9C-EDC1C192A820,0x2800,0x35000)/\EFI\BOOT\BOOTX64.EFI.
DxeImageVerification: MeasureVariable (Pcr - 7, EventType - 800000E0, VariableName - db, VendorGuid - D719B2CB-3D3A-4596-A3BC-DAD00E67656F)
MeasureBootPolicyVariable - Not Found
InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7CC53340
Loading driver at 0x0007CB61000 EntryPoint=0x0007CB83000
InstallProtocolInterface: BC62157E-3E33-4FEC-9920-2D3B36D750DF 7D304998
ProtectUefiImageCommon - 0x7CC53340
- 0x000000007CB61000 - 0x00000000000D3000
[Variable]: Rewritten a preexisting variable(0x00000006) with different attributes(0x00000007) - 605DAB50-E046-4300-ABB6-3DD810DD8B23:MokListRT
[Variable]: Rewritten a preexisting variable(0x00000006) with different attributes(0x00000007) - 605DAB50-E046-4300-ABB6-3DD810DD8B23:MokListXRT
[Variable]: Rewritten a preexisting variable(0x00000006) with different attributes(0x00000007) - 605DAB50-E046-4300-ABB6-3DD810DD8B23:SbatLevelRT
InstallProtocolInterface: 605DAB50-E046-4300-ABB6-3DD810DD8B23 7CC10160
FSOpen: Open '\EFI\BOOT\fbx64.efi' Success
FSOpen: Open '\EFI\BOOT\fbx64.efi' Success
FSOpen: Open 'EFI' Success
FSOpen: Open 'ubuntu' Success
FSOpen: Open 'BOOTX64.CSV' Success
FSOpen: Open '\EFI\ubuntu\BOOTX64.CSV' Success
FSOpen: Open '\EFI\ubuntu\shimx64.efi' Success
[Security] 3rd party image[0] can be loaded after EndOfDxe: PciRoot(0x0)/Pci(0x1,0x2)/Pci(0x0,0x0)/HD(15,GPT,B892B9DA-959A-4C13-9B9C-EDC1C192A820,0x2800,0x35000)/\EFI\ubuntu\shimx64.efi.
DxeImageVerification: MeasureVariable (Pcr - 7, EventType - 800000E0, VariableName - db, VendorGuid - D719B2CB-3D3A-4596-A3BC-DAD00E67656F)
MeasureBootPolicyVariable - Not Found
InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7CCCA040
Loading driver at 0x0007C9A4000 EntryPoint=0x0007C9C6000
InstallProtocolInterface: BC62157E-3E33-4FEC-9920-2D3B36D750DF 7CC4A818
ProtectUefiImageCommon - 0x7CCCA040
- 0x000000007C9A4000 - 0x00000000000D3000
[Variable]: Rewritten a preexisting variable(0x00000006) with different attributes(0x00000007) - 605DAB50-E046-4300-ABB6-3DD810DD8B23:MokListRT
[Variable]: Rewritten a preexisting variable(0x00000006) with different attributes(0x00000007) - 605DAB50-E046-4300-ABB6-3DD810DD8B23:MokListXRT
[Variable]: Rewritten a preexisting variable(0x00000006) with different attributes(0x00000007) - 605DAB50-E046-4300-ABB6-3DD810DD8B23:SbatLevelRT
InstallProtocolInterface: 605DAB50-E046-4300-ABB6-3DD810DD8B23 7CA53160
FSOpen: Open '\EFI\ubuntu\grubx64.efi' Success
ConvertPages: range 1000000 - 43A4FFF covers multiple entries
ConvertPages: range 1000000 - 43A4FFF covers multiple entries
Ok, so at least it's a firmware call that crashes everything, so slightly less concerning than just the kernel doing it during normal initialization.
Is there any way to get debugging on the firmware side when it handles ExitBootServices?
Given it only happens on first boot (or at least looked that way here) and that's when we setup the boot NVRAM entries, I suspect it's got something to do with ExitBootServices applying the variable restrictions (so you can't write to the firmware-only ones)?
I'm hoping to learn more tomorrow by attaching gdb
to the VM.
I hooked up gdb
to the VM and was able to load the ovmf
symbols. I setup a breakpoint for CoreExitBootServices()
, but the crash happened before that breakpoint. So, best I can tell we're crashing somewhere in the glue (thunk?) between the EFI stub and the firmware.
Since the address to the invalid instruction always seems to be the same (0xff96
), I set a break there instead. I /did/ hit that breakpoint after the EFI stub's call to ExitBootServices()
and before the Invalid Opcode Exception. This allowed me to collect the following stacktrace, though I don't know how to associate it with the debug symbols from the stub:
(gdb) bt
#0 0x000000000000ff96 in ?? ()
#1 0x000000003da40cbf in ?? ()
#2 0x8000000000000001 in ?? ()
#3 0x0000000000000000 in ?? ()
GRUB debug tells me it loaded the kernel at nearby/lower address:
loader/efi/linux.c:96: kernel_addr: 0x3cc55000 handover_offset: 0x190 params:
0x3ffff000
So it seems plausible that we're still executing somewhere in the stub. And the instruction before the return address in the stack 0x000000003da40cbf
in memory does happen to be a call
:
0x3da40caf: sub $0x20,%rsp
0x3da40cb3: mov (%rax),%rdx
0x3da40cb6: mov %r13,%rcx
0x3da40cb9: call *0xe8(%rsi) <<< This must be where things are going bad
0x3da40cbf: mov %rax,%r12 <<< Because this is the return address
0x3da40cc2: add $0x20,%rsp
0x3da40cc6: jmp 0x3da40ce1
0x3da40cc8: mov 0x3c(%rdx),%edx
0x3da40ccb: mov %r13,%rsi
0x3da40cce: mov 0x80(%rdx),%edi
0x3da40cd4: mov (%rax),%rdx
0x3da40cd7: xor %eax,%eax
0x3da40cd9: call 0x3da402e0
0x3da40cde: mov %rax,%r12
0x3da40ce1: mov %r12,%rsi
0x3da40ce4: lea 0x9018(%rip),%rdi # 0x3da49d03
0x3da40ceb: xor %eax,%eax
0x3da40ced: call 0x3da40527
0x3da40cf2: movabs $0x8000000000000002,%rax
I can find this same code in the disassembly of the EFI stub:
1debaaf: 48 83 ec 20 sub $0x20,%rsp
1debab3: 48 8b 10 mov (%rax),%rdx
1debab6: 4c 89 e9 mov %r13,%rcx
1debab9: ff 96 e8 00 00 00 call *0xe8(%rsi)
1debabf: 49 89 c4 mov %rax,%r12
1debac2: 48 83 c4 20 add $0x20,%rsp
1debac6: eb 19 jmp 0x1debae1
1debac8: 8b 52 3c mov 0x3c(%rdx),%edx
1debacb: 4c 89 ee mov %r13,%rsi
1debace: 8b ba 80 00 00 00 mov 0x80(%rdx),%edi
1debad4: 48 8b 10 mov (%rax),%rdx
1debad7: 31 c0 xor %eax,%eax
1debad9: e8 02 f6 ff ff call 0x1deb0e0
1debade: 49 89 c4 mov %rax,%r12
1debae1: 4c 89 e6 mov %r12,%rsi
1debae4: 48 8d 3d 18 90 00 00 lea 0x9018(%rip),%rdi # 0x1df4b03
1debaeb: 31 c0 xor %eax,%eax
1debaed: e8 35 f8 ff ff call 0x1deb327
1debaf2: 48 b8 02 00 00 00 00 movabs $0x8000000000000002,%rax
And finally, I was able to locate the same code in the disassembly of the efi-stub-helper.o
object in my build tree, which has debug symbols that show this code to be in efi_exit_boot_services()
. So it doesn't seem like we can confidently point the finger at OVMF just yet - though I'm not entirely sure how to get to root cause.
Here's the disassembly of efi_exit_boot_services()
in it's entirety - I believe the crashing line is at 0x8fa:
00000000000007f4 <efi_exit_boot_services>:
7f4: f3 0f 1e fa endbr64
7f8: 41 57 push %r15
7fa: 31 c0 xor %eax,%eax
7fc: 49 89 d7 mov %rdx,%r15
7ff: 41 56 push %r14
801: 49 89 ce mov %rcx,%r14
804: 41 55 push %r13
806: 49 89 fd mov %rdi,%r13
809: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 810 <efi_exit_boot_services+0x1c>
810: 41 54 push %r12
812: 55 push %rbp
813: 53 push %rbx
814: 48 89 f3 mov %rsi,%rbx
817: 41 50 push %r8
819: e8 00 00 00 00 call 81e <efi_exit_boot_services+0x2a>
81e: 48 89 df mov %rbx,%rdi
821: e8 00 00 00 00 call 826 <efi_exit_boot_services+0x32>
826: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 82d <efi_exit_boot_services+0x39>
82d: 49 89 c4 mov %rax,%r12
830: 48 89 c6 mov %rax,%rsi
833: 31 c0 xor %eax,%eax
835: e8 00 00 00 00 call 83a <efi_exit_boot_services+0x46>
83a: 4d 85 e4 test %r12,%r12
83d: 0f 85 25 02 00 00 jne a68 <efi_exit_boot_services+0x274>
843: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 84a <efi_exit_boot_services+0x56>
84a: 31 c0 xor %eax,%eax
84c: e8 00 00 00 00 call 851 <efi_exit_boot_services+0x5d>
851: 4c 89 fe mov %r15,%rsi
854: 48 89 df mov %rbx,%rdi
857: 41 ff d6 call *%r14
85a: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 861 <efi_exit_boot_services+0x6d>
861: 48 89 c6 mov %rax,%rsi
864: 49 89 c4 mov %rax,%r12
867: 31 c0 xor %eax,%eax
869: e8 00 00 00 00 call 86e <efi_exit_boot_services+0x7a>
86e: 40 8a 2d 00 00 00 00 mov 0x0(%rip),%bpl # 875 <efi_exit_boot_services+0x81>
875: 4d 85 e4 test %r12,%r12
878: 74 38 je 8b2 <efi_exit_boot_services+0xbe>
87a: 31 c0 xor %eax,%eax
87c: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 883 <efi_exit_boot_services+0x8f>
883: e8 00 00 00 00 call 888 <efi_exit_boot_services+0x94>
888: 40 84 ed test %bpl,%bpl
88b: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # 892 <efi_exit_boot_services+0x9e>
892: 48 8b 03 mov (%rbx),%rax
895: 0f 84 af 01 00 00 je a4a <efi_exit_boot_services+0x256>
89b: 48 8b 52 60 mov 0x60(%rdx),%rdx
89f: 48 83 ec 20 sub $0x20,%rsp
8a3: 48 8b 08 mov (%rax),%rcx
8a6: ff 52 48 call *0x48(%rdx)
8a9: 48 83 c4 20 add $0x20,%rsp
8ad: e9 a8 01 00 00 jmp a5a <efi_exit_boot_services+0x266>
8b2: 80 3d 00 00 00 00 00 cmpb $0x0,0x0(%rip) # 8b9 <efi_exit_boot_services+0xc5>
8b9: 74 13 je 8ce <efi_exit_boot_services+0xda>
8bb: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 8c2 <efi_exit_boot_services+0xce>
8c2: 31 c0 xor %eax,%eax
8c4: e8 00 00 00 00 call 8c9 <efi_exit_boot_services+0xd5>
8c9: e8 00 00 00 00 call 8ce <efi_exit_boot_services+0xda>
8ce: 31 c0 xor %eax,%eax
8d0: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 8d7 <efi_exit_boot_services+0xe3>
8d7: e8 00 00 00 00 call 8dc <efi_exit_boot_services+0xe8>
8dc: 40 84 ed test %bpl,%bpl
8df: 48 8b 43 20 mov 0x20(%rbx),%rax
8e3: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # 8ea <efi_exit_boot_services+0xf6>
8ea: 74 1d je 909 <efi_exit_boot_services+0x115>
8ec: 48 8b 72 60 mov 0x60(%rdx),%rsi
8f0: 48 83 ec 20 sub $0x20,%rsp
8f4: 48 8b 10 mov (%rax),%rdx
8f7: 4c 89 e9 mov %r13,%rcx
> 8fa: ff 96 e8 00 00 00 call *0xe8(%rsi)
900: 49 89 c4 mov %rax,%r12
903: 48 83 c4 20 add $0x20,%rsp
907: eb 19 jmp 922 <efi_exit_boot_services+0x12e>
909: 8b 52 3c mov 0x3c(%rdx),%edx
90c: 4c 89 ee mov %r13,%rsi
90f: 8b ba 80 00 00 00 mov 0x80(%rdx),%edi
915: 48 8b 10 mov (%rax),%rdx
918: 31 c0 xor %eax,%eax
91a: e8 00 00 00 00 call 91f <efi_exit_boot_services+0x12b>
91f: 49 89 c4 mov %rax,%r12
922: 4c 89 e6 mov %r12,%rsi
925: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 92c <efi_exit_boot_services+0x138>
92c: 31 c0 xor %eax,%eax
92e: e8 00 00 00 00 call 933 <efi_exit_boot_services+0x13f>
933: 48 b8 02 00 00 00 00 movabs $0x8000000000000002,%rax
93a: 00 00 80
93d: 49 39 c4 cmp %rax,%r12
940: 0f 85 22 01 00 00 jne a68 <efi_exit_boot_services+0x274>
946: 48 8b 53 28 mov 0x28(%rbx),%rdx
94a: 48 8b 43 08 mov 0x8(%rbx),%rax
94e: 40 84 ed test %bpl,%bpl
951: 48 8b 12 mov (%rdx),%rdx
954: 48 89 10 mov %rdx,(%rax)
957: 4c 8b 43 10 mov 0x10(%rbx),%r8
95b: 74 30 je 98d <efi_exit_boot_services+0x199>
95d: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 964 <efi_exit_boot_services+0x170>
964: 48 8b 13 mov (%rbx),%rdx
967: 56 push %rsi
:
968: 4d 89 c1 mov %r8,%r9
96b: 48 8b 4b 08 mov 0x8(%rbx),%rcx
96f: 4c 8b 43 20 mov 0x20(%rbx),%r8
973: 48 8b 40 60 mov 0x60(%rax),%rax
977: ff 73 18 push 0x18(%rbx)
97a: 48 8b 12 mov (%rdx),%rdx
97d: 48 83 ec 20 sub $0x20,%rsp
981: ff 50 38 call *0x38(%rax)
984: 49 89 c4 mov %rax,%r12
987: 48 83 c4 30 add $0x30,%rsp
98b: eb 38 jmp 9c5 <efi_exit_boot_services+0x1d1>
98d: 4c 8b 4b 18 mov 0x18(%rbx),%r9
991: 41 c7 40 04 00 00 00 movl $0x0,0x4(%r8)
998: 00
999: 48 8b 4b 20 mov 0x20(%rbx),%rcx
99d: c7 41 04 00 00 00 00 movl $0x0,0x4(%rcx)
9a4: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # 9ab <efi_exit_boot_services+0x1b7>
9ab: 48 8b 03 mov (%rbx),%rax
9ae: 48 8b 73 08 mov 0x8(%rbx),%rsi
9b2: 8b 52 3c mov 0x3c(%rdx),%edx
9b5: 8b 7a 28 mov 0x28(%rdx),%edi
9b8: 48 8b 10 mov (%rax),%rdx
9bb: 31 c0 xor %eax,%eax
9bd: e8 00 00 00 00 call 9c2 <efi_exit_boot_services+0x1ce>
9c2: 49 89 c4 mov %rax,%r12
9c5: 4d 85 e4 test %r12,%r12
9c8: 0f 85 9a 00 00 00 jne a68 <efi_exit_boot_services+0x274>
9ce: 4c 89 fe mov %r15,%rsi
9d1: 48 89 df mov %rbx,%rdi
9d4: 41 ff d6 call *%r14
9d7: 49 89 c4 mov %rax,%r12
9da: 48 85 c0 test %rax,%rax
9dd: 0f 85 85 00 00 00 jne a68 <efi_exit_boot_services+0x274>
9e3: 31 c0 xor %eax,%eax
9e5: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 9ec <efi_exit_boot_services+0x1f8>
9ec: e8 00 00 00 00 call 9f1 <efi_exit_boot_services+0x1fd>
9f1: 40 84 ed test %bpl,%bpl
9f4: 48 8b 43 20 mov 0x20(%rbx),%rax
9f8: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # 9ff <efi_exit_boot_services+0x20b>
9ff: 74 1d je a1e <efi_exit_boot_services+0x22a>
a01: 48 8b 72 60 mov 0x60(%rdx),%rsi
a05: 48 83 ec 20 sub $0x20,%rsp
a09: 48 8b 10 mov (%rax),%rdx
a0c: 4c 89 e9 mov %r13,%rcx
a0f: ff 96 e8 00 00 00 call *0xe8(%rsi)
a15: 49 89 c4 mov %rax,%r12
a18: 48 83 c4 20 add $0x20,%rsp
a1c: eb 19 jmp a37 <efi_exit_boot_services+0x243>
a1e: 8b 52 3c mov 0x3c(%rdx),%edx
a21: 4c 89 ee mov %r13,%rsi
a24: 8b ba 80 00 00 00 mov 0x80(%rdx),%edi
a2a: 48 8b 10 mov (%rax),%rdx
a2d: 31 c0 xor %eax,%eax
a2f: e8 00 00 00 00 call a34 <efi_exit_boot_services+0x240>
a34: 49 89 c4 mov %rax,%r12
a37: 4c 89 e6 mov %r12,%rsi
a3a: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # a41 <efi_exit_boot_services+0x24d>
a41: 31 c0 xor %eax,%eax
a43: e8 00 00 00 00 call a48 <efi_exit_boot_services+0x254>
a48: eb 1e jmp a68 <efi_exit_boot_services+0x274>
a4a: 8b 52 3c mov 0x3c(%rdx),%edx
a4d: 48 8b 30 mov (%rax),%rsi
a50: 31 c0 xor %eax,%eax
a52: 8b 7a 30 mov 0x30(%rdx),%edi
a55: e8 00 00 00 00 call a5a <efi_exit_boot_services+0x266>
a5a: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # a61 <efi_exit_boot_services+0x26d>
a61: 31 c0 xor %eax,%eax
a63: e8 00 00 00 00 call a68 <efi_exit_boot_services+0x274>
a68: 5a pop %rdx
a69: 4c 89 e0 mov %r12,%rax
a6c: 5b pop %rbx
a6d: 5d pop %rbp
a6e: 41 5c pop %r12
a70: 41 5d pop %r13
a72: 41 5e pop %r14
a74: 41 5f pop %r15
a76: c3 ret
@sforshee and I were able to make a bit more progress. We found that the crash is occuring because the ExitBootServices()
callback in the Boot Services Table is NULL. It's not NULL at boot, so we added a watchpoint to see what was clearing it. Turns out it is shim
(15.4-0ubuntu2)
:
(gdb) bt
#0 unhook_system_services () at replacements.c:49
#1 0x000000007cb8912b in load_image (BootPolicy=<optimized out>, ParentImageHandle=<optimized out>, DevicePath=<optimized out>,
SourceBuffer=<optimized out>, SourceSize=0, ImageHandle=0x7ef5d930) at replacements.c:59
#2 0x000000007cd00260 in ?? ()
#3 0x000000007ccf9518 in ?? ()
#4 0x000000007d303b18 in ?? ()
#5 0x000000007ccf4498 in ?? ()
#6 0x0000000000000000 in ?? ()
(gdb)
@sforshee suggested the following patch to shim:
diff --git a/replacements.c b/replacements.c
index 278a8e7848e4..072da428da46 100644
--- a/replacements.c
+++ b/replacements.c
@@ -33,7 +33,9 @@ get_active_systab(void)
static typeof(systab->BootServices->LoadImage) system_load_image;
static typeof(systab->BootServices->StartImage) system_start_image;
static typeof(systab->BootServices->Exit) system_exit;
+#if !defined(DISABLE_EBS_PROTECTION)
static typeof(systab->BootServices->ExitBootServices) system_exit_boot_services;
+#endif
static EFI_HANDLE last_loaded_image;
@@ -45,7 +47,9 @@ unhook_system_services(void)
systab->BootServices->LoadImage = system_load_image;
systab->BootServices->StartImage = system_start_image;
+#if !defined(DISABLE_EBS_PROTECTION)
systab->BootServices->ExitBootServices = system_exit_boot_services;
+#endif
gBS = systab->BootServices;
}
This does prevent shim from clearing ExitBootServices
- now that watchpoint doesn't get tripped until OVMF
runs InternalMemZeroMem()
, which I suppose is expected. However, we still don't seem to be surviving the call to exit_boot_services
. Now we're crashing in a new different way:
EFI stub: Actually calling exit_boot_services
!!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000000 I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
RIP - 000000000000FF91, CS - 0000000000000038, RFLAGS - 0000000000200003
RAX - 000000000000F055, RCX - 0000000000000000, RDX - 00000000000000EB
RBX - 000000003DA49AEB, RSP - 000000007EF64EB8, RBP - 000000007EF64F00
RSI - 000000000000000A, RDI - 0000000000000000
R8 - 000000008FE6ED20, R9 - 0000000000000000, R10 - 000000007DA01508
R11 - 0000000000000001, R12 - 000000003DA49D03, R13 - 0000000000000000
R14 - 000000003DA4385C, R15 - 000000007EF65270
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
CR0 - 0000000080010033, CR2 - FFFFFFFFFFFFFFFF, CR3 - 000000007EC01000
CR4 - 0000000000000668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 000000007E9E3000 0000000000000047, LDTR - 0000000000000000
IDTR - 000000007E163018 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 000000007EF64B10
!!!! Can't find image information. !!!!
Turns out that was my fault - I had sprinkled efi_info()
calls throughout, not realizing that facility goes away with ExitBootServices
. Using a non-instrumented Ubuntu kernel, I can now get to a prompt in KVM mode. In QEMU emulation mode it fails w/ complains about BIOS corruption - but that may be unrelated.
PR for shim: https://github.com/rhboot/shim/pull/378
Closing this one on our side as we've now confirmed the issue is coming from the shim and not from anything LXD is doing (other than us using UEFI+SB by default).
Required information
Issue description
Machines configured for use with MAAS always boot over the network. When an operating system is deployed MAAS provides a grub.cfg which finds the local bootloader and chainloads it. Previously this caused compatibility issues which caused the chain of trust to be broken. These issues have been fixed in Ubuntu and upstream.
While I have been able to verify SecureBoot is now working with MAAS on physical hardware and libvirt VMs it is still failing with LXD. VMs in LXD either hang or drop to the GRUB shell. We believe this may be due to LXD passing the
-no-reboot
option to qemu.Steps to reproduce
/var/lib/maas/boot-resources/current
or/var/snap/maas/common/maas/boot-resources/current
replacebootx64.efi
with the latest SHIM from shim-signed_1.47+15.4-0ubuntu2_amd64.deb and replacegrubx64.efi
with the latest GRUB from grub-efi-amd64-signed_1.169+2.04-1ubuntu45_amd64.deb"security.secureboot": "true"
Information to attach
LXD qemu command:
libvirt qemu command: