AMDESE / AMDSEV

AMD Secure Encrypted Virtualization

SEV-SNP: kvm run failed Invalid argument #138

Open TheNetAdmin opened 1 year ago

TheNetAdmin commented 1 year ago

When I try to launch an SEV-SNP VM, I get the following error during guest boot. What might have gone wrong in my setup?

error: kvm run failed Invalid argument
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000000
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=00000000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 00000000 00000000
CS =0000 00000000 00000000 00000000
SS =0000 00000000 00000000 00000000
DS =0000 00000000 00000000 00000000
FS =0000 00000000 00000000 00000000
GS =0000 00000000 00000000 00000000
LDT=0000 00000000 00000000 00000000
TR =0000 00000000 00000000 00000000
GDT=     00000000 00000000
IDT=     00000000 00000000
CR0=80010033 CR2=00000000 CR3=00000000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000900
Code=<??> ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??

And while QEMU reports this error, the host Linux reports

$ dmesg

[  198.693401] kvm [1308]: vcpu0, guest rIP: 0x0 vmgexit: unsupported event - exit_info_1=0x404, exit_info_2=0x0

I used the latest sev-snp commit a480a514798b9d226b5aca9f96ce47c9458f504c

My hardware is

And the host Linux system is set up as

$ uname -a
Linux 5.19.0-rc6-snp-host-d9bd54fea4d2 #1 SMP Thu Jan 26 13:43:32 PST 2023 x86_64 GNU/Linux

$ dmesg | grep -i sev
[    0.656071] SEV-SNP: RMP table physical address 0x0000000054000000 - 0x00000000747fffff
[    2.536187] ccp 0000:42:00.1: sev enabled
[    2.598793] ccp 0000:42:00.1: SEV firmware update successful
[    4.378148] ccp 0000:42:00.1: SEV API:1.51 build:3
[    4.378157] ccp 0000:42:00.1: SEV-SNP API:1.51 build:3
[    4.409337] SEV supported: 410 ASIDs
[    4.409337] SEV-ES and SEV-SNP supported: 99 ASIDs

$ cpuid -1 -l 0x8000001f
CPU:
   AMD Secure Encryption (0x8000001f):
      SME: secure memory encryption support    = true
      SEV: secure encrypted virtualize support = true
      VM page flush MSR support                = true
      SEV-ES: SEV encrypted state support      = true
      SEV-SNP: SEV secure nested paging        = true
      VMPL: VM permission levels               = true
      hardware cache coher across enc domains  = true
      SEV guest exec only from 64-bit host     = true
      restricted injection                     = true
      alternate injection                      = true
      full debug state swap for SEV-ES guests  = true
      disallowing IBS use by host              = true
      encryption bit position in PTE           = 0x33 (51)
      physical address space width reduction   = 0x5 (5)
      number of VM permission levels           = 0x4 (4)
      number of SEV-enabled guests supported   = 0x1fd (509)
      minimum SEV guest ASID                   = 0x64 (100)
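
The CPUID values above can be cross-checked against the ASID counts in the dmesg output. The following is a minimal sketch, assuming the usual interpretation that ASIDs below the "minimum SEV guest ASID" are reserved for SEV-ES/SEV-SNP guests and the rest are available to plain SEV guests. The function name `asid_split` is made up for illustration.

```python
def asid_split(max_asid, min_sev_asid):
    """Split the ASID space into (plain SEV count, SEV-ES/SEV-SNP count).

    Assumption: ASIDs 1 .. min_sev_asid-1 are usable by SEV-ES/SEV-SNP
    guests, and ASIDs min_sev_asid .. max_asid by plain SEV guests.
    """
    es_snp_asids = min_sev_asid - 1
    sev_asids = max_asid - min_sev_asid + 1
    return sev_asids, es_snp_asids


# Values from the cpuid output: 0x1fd (509) guests, minimum SEV ASID 0x64 (100)
print(asid_split(0x1FD, 0x64))  # (410, 99)
```

Under that assumption the result agrees with the "SEV supported: 410 ASIDs" and "SEV-ES and SEV-SNP supported: 99 ASIDs" lines in dmesg.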

And the full log of the QEMU output

$ sudo ./launch-qemu.sh -hda ../vm-images/ubuntu-uefi.qcow2 -sev-snp -kernel linux/guest/arch/x86/boot/bzImage

32+0 records in
1+0 records out
512 bytes copied, 0.00021422 s, 2.4 MB/s
/home/netadmin/code/AMDSEV/usr/local/bin/qemu-system-x86_64 -enable-kvm -cpu EPYC-v4 -machine q35 -smp 8,maxcpus=64 -m 16384M,slots=5,maxmem=30G -no-reboot -drive if=pflash,format=raw,unit=0,file=/home/netadmin/code/AMDSEV/usr/local/share/qemu/OVMF_CODE.fd,readonly -drive if=pflash,format=raw,unit=1,file=/home/netadmin/code/AMDSEV/ubuntu-uefi.fd -netdev user,id=vmnic  -device virtio-net-pci,disable-legacy=on,iommu_platform=true,netdev=vmnic,romfile= -drive file=/home/netadmin/code/vm-images/ubuntu-uefi.qcow2,if=none,id=disk0,format=qcow2 -device virtio-scsi-pci,id=scsi0,disable-legacy=on,iommu_platform=true -device scsi-hd,drive=disk0 -machine memory-encryption=sev0,vmport=off -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1 -kernel linux/guest/arch/x86/boot/bzImage -append "console=ttyS0 earlyprintk=serial root=/dev/sda2" -nographic -monitor pty -monitor unix:monitor,server,nowait 
Mapping CTRL-C to CTRL-]
Launching VM ...
  /tmp/cmdline.1282
qemu-system-x86_64: -drive if=pflash,format=raw,unit=0,file=/home/netadmin/code/AMDSEV/usr/local/share/qemu/OVMF_CODE.fd,readonly: warning: short-form boolean option 'readonly' deprecated
Please use readonly=on instead
char device redirected to /dev/pts/2 (label compat_monitor0)
qemu-system-x86_64: warning: Number of hotpluggable cpus requested (64) exceeds the recommended cpus supported by KVM (48)
SecCoreStartupWithStack(0xFFFCC000, 0x820000)
Register PPI Notify: DCD0BE23-9586-40F4-B643-06522CED4EDE
Install PPI: 8C8CE578-8A3D-4F1C-9935-896185C32DD3
Install PPI: 5473C07A-3DCB-4DCA-BD6F-1E9689E7349A
The 0th FV start address is 0x00000820000, size is 0x000E0000, handle is 0x820000
Register PPI Notify: 49EDB1C1-BF21-4761-BB12-EB0031AABB39
Register PPI Notify: EA7CA24B-DED5-4DAD-A389-BF827E8F9B38
Install PPI: B9E0ABFE-5979-4914-977F-6DEE78C278A6
Install PPI: DBE23AA9-A345-4B97-85B6-B226F1617389
Install PPI: 138F9CF4-F0E7-4721-8F49-F5FFECF42D40
DiscoverPeimsAndOrderWithApriori(): Found 0x8 PEI FFS files in the 0th FV
Loading PEIM 9B3ADA4F-AE56-4C24-8DEA-F03B7558AE50
Loading PEIM at 0x0000082BDC0 EntryPoint=0x0000082F097 PcdPeim.efi
Install PPI: 06E81C58-4AD7-44BC-8390-F10265F72480
Install PPI: 01F34D25-4DE2-23AD-3FF3-36353FF323F1
Install PPI: 4D8B155B-C059-4C8F-8926-06FD4331DB8A
Install PPI: A60C6B59-E459-425D-9C69-0BCC9CB27D81
Register PPI Notify: 605EA650-C65C-42E1-BA80-91A52AB618C6
Loading PEIM A3610442-E69F-4DF3-82CA-2360C4031A23
Loading PEIM at 0x00000830CC0 EntryPoint=0x00000832191 ReportStatusCodeRouterPei.efi
Install PPI: 0065D394-9951-4144-82A3-0AFC8579C251
Install PPI: 229832D3-7A30-4B36-B827-F40CB7D45436
Loading PEIM 9D225237-FA01-464C-A949-BAABC02D31D0
Loading PEIM at 0x00000832E40 EntryPoint=0x000008341E0 StatusCodeHandlerPei.efi
Loading PEIM 222C386D-5ABC-4FB4-B124-FBB82488ACF4
Loading PEIM at 0x00000834FC0 EntryPoint=0x0000083C83E PlatformPei.efi
Platform PEIM Loaded
CMOS:
00: 32 00 32 00 21 00 02 27 02 23 26 02 00 80 00 00
10: 00 00 00 00 06 80 02 FF FF 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: FF FF 20 00 00 7F 00 20 30 00 00 00 00 12 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 80 03 00 07
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
QemuFwCfgProbe: Supported 1, DMA 0
Select Item: 0x19
Select Item: 0x2C
S3 support was detected on QEMU
Install PPI: 7408D748-FC8C-4EE6-9288-C4BEC092A410
Select Item: 0x19
Select Item: 0x19
Select Item: 0x25
PlatformGetFirstNonAddressCB: FirstNonAddress=0x480000000
Select Item: 0x19
Select Item: 0x19
Select Item: 0x26
PlatformAddressWidthFromCpuid: Signature: 'AuthenticAMD', PhysBits: 40, QemuQuirk: On, Valid: Yes
PlatformDynamicMmioWindow: using dynamic mmio window
PlatformDynamicMmioWindow:   Addr Space 0x10000000000 (1024 GB)
PlatformDynamicMmioWindow:   MMIO Space 0x2000000000 (128 GB)
Select Item: 0x19
Select Item: 0x25
PlatformDynamicMmioWindow:   Pci64 Base 0xE000000000
PlatformDynamicMmioWindow:   Pci64 Size 0x2000000000
AddressWidthInitialization: Pci64Base=0xE000000000 Pci64Size=0x2000000000
Select Item: 0x5
PlatformMaxCpuCountInitialization: BootCpuCount=8 MaxCpuCount=64
Select Item: 0x19
Select Item: 0x25
PlatformGetLowMemoryCB: LowMemory=0x80000000
PublishPeiMemory: PhysMemAddressWidth=40 PeiMemoryCap=65548 KB
PeiInstallPeiMemory MemoryBegin 0x7BD7D000, MemoryLength 0x4003000
Select Item: 0x19
Select Item: 0x25
PlatformQemuInitializeRam called
Select Item: 0x19
Select Item: 0x25
Select Item: 0x19
Select Item: 0x25
PlatformAddHobCB: Reserved [0xFEFFC000, 0xFF000000)
PlatformAddHobCB: HighMemory [0x100000000, 0x480000000)
Reserved variable store memory: 0x7FCFC000; size: 528kb
Platform PEI Firmware Volume Initialization
Install PPI: 49EDB1C1-BF21-4761-BB12-EB0031AABB39
Notify: PPI Guid: 49EDB1C1-BF21-4761-BB12-EB0031AABB39, Peim notify entry point: 824FC3
The 1th FV start address is 0x00000900000, size is 0x00D00000, handle is 0x900000
Select Item: 0x19
Select Item: 0x25
Select Item: 0x19
Register PPI Notify: EE16160A-E8BE-47A6-820A-C6900DB0250A
SEV is enabled (mask 0x8000000000000)
SEV-ES is enabled, 128 GHCB pages allocated starting at 0x7FC7C000
SEV-ES is enabled, 64 GHCB backup pages allocated starting at 0x7F9C0000
Select Item: 0x19
Temp Stack : BaseAddress=0x818000 Length=0x8000
Temp Heap  : BaseAddress=0x810000 Length=0x8000
Total temporary memory:    65536 bytes.
  temporary memory stack ever used:       31624 bytes.
  temporary memory heap used for HobList: 7800 bytes.
  temporary memory heap occupied by memory pages: 0 bytes.
Memory Allocation 0x00000000 0x80D000 - 0x80DFFF
Memory Allocation 0x00000000 0x80E000 - 0x80EFFF
Memory Allocation 0x0000000A 0x7FD80000 - 0x7FFFFFFF
Memory Allocation 0x0000000A 0x810000 - 0x81FFFF
Memory Allocation 0x0000000A 0x807000 - 0x807FFF
Memory Allocation 0x0000000A 0x800000 - 0x805FFF
Memory Allocation 0x0000000A 0x808000 - 0x808FFF
Memory Allocation 0x0000000A 0x809000 - 0x80AFFF
Memory Allocation 0x0000000A 0x80C000 - 0x80CFFF
Memory Allocation 0x0000000A 0x806000 - 0x806FFF
Memory Allocation 0x0000000A 0x80B000 - 0x80BFFF
Memory Allocation 0x00000006 0x7FCFC000 - 0x7FD7FFFF
Memory Allocation 0x0000000A 0x820000 - 0x8FFFFF
Memory Allocation 0x00000004 0x900000 - 0x15FFFFF
Memory Allocation 0x00000000 0xB0000000 - 0xBFFFFFFF
Memory Allocation 0x00000000 0x7FC7C000 - 0x7FCFBFFF
Memory Allocation 0x00000004 0x7FA00000 - 0x7FBFFFFF
Memory Allocation 0x00000007 0x7FC00000 - 0x7FC7BFFF
Memory Allocation 0x00000004 0x7F9C0000 - 0x7F9FFFFF
Memory Allocation 0x00000004 0x7F9BF000 - 0x7F9BFFFF
Old Stack size 32768, New stack size 131072
Stack Hob: BaseAddress=0x7BD7D000 Length=0x20000
Heap Offset = 0x7B58D000 Stack Offset = 0x7B57D000
TemporaryRamMigration(0x810000, 0x7BD95000, 0x10000)
Loading PEIM 52C05B14-0B98-496C-BC3B-04B50211D680
Loading PEIM at 0x0007F9B3000 EntryPoint=0x0007F9BB19F PeiCore.efi
Reinstall PPI: 8C8CE578-8A3D-4F1C-9935-896185C32DD3
Reinstall PPI: 5473C07A-3DCB-4DCA-BD6F-1E9689E7349A
Reinstall PPI: B9E0ABFE-5979-4914-977F-6DEE78C278A6
Install PPI: F894643D-C449-42D1-8EA8-85BDD8C65BDE
Loading PEIM 9B3ADA4F-AE56-4C24-8DEA-F03B7558AE50
Loading PEIM at 0x0007F9AE000 EntryPoint=0x0007F9B12D7 PcdPeim.efi
Reinstall PPI: 06E81C58-4AD7-44BC-8390-F10265F72480
Reinstall PPI: 4D8B155B-C059-4C8F-8926-06FD4331DB8A
Reinstall PPI: 01F34D25-4DE2-23AD-3FF3-36353FF323F1
Reinstall PPI: A60C6B59-E459-425D-9C69-0BCC9CB27D81
Loading PEIM 86D70125-BAA3-4296-A62F-602BEBBB9081
Loading PEIM at 0x0007F9A9000 EntryPoint=0x0007F9AC355 DxeIpl.efi
Install PPI: 1A36E4E7-FAB6-476A-8E75-695A0576FDD7
Install PPI: 0AE8CE5D-E448-4437-A8D7-EBF5F194F731
Loading PEIM 89E549B0-7CFE-449D-9BA3-10D8B2312D71
Loading PEIM at 0x0007F9A4000 EntryPoint=0x0007F9A6B93 S3Resume2Pei.efi
Install PPI: 6D582DBC-DB85-4514-8FCC-5ADF6227B147
Loading PEIM EDADEB9D-DDBA-48BD-9D22-C1C169C8C5C6
Loading PEIM at 0x0007F994000 EntryPoint=0x0007F998F8B CpuMpPei.efi
Register PPI Notify: F894643D-C449-42D1-8EA8-85BDD8C65BDE
Notify: PPI Guid: F894643D-C449-42D1-8EA8-85BDD8C65BDE, Peim notify entry point: 7F99B111
AP Loop Mode is 1
AP Vector: non-16-bit = 7F786000/32A
WakeupBufferStart = 9F000, WakeupBufferSize = 1000
AP Vector: 16-bit = 9F000/39, ExchangeInfo = 9F039/A4
CpuMpPei: 5-Level Paging = 0
APIC MODE is 1
MpInitLib: Find 8 processors in system.
GetMicrocodePatchInfoFromHob: Microcode patch cache HOB is not found.
CpuMpPei: 5-Level Paging = 0
CPU[0000]: Microcode revision = 00000000, expected = 00000000
CPU[0001]: Microcode revision = 00000000, expected = 00000000
CPU[0002]: Microcode revision = 00000000, expected = 00000000
CPU[0003]: Microcode revision = 00000000, expected = 00000000
CPU[0004]: Microcode revision = 00000000, expected = 00000000
CPU[0005]: Microcode revision = 00000000, expected = 00000000
CPU[0006]: Microcode revision = 00000000, expected = 00000000
CPU[0007]: Microcode revision = 00000000, expected = 00000000
Register PPI Notify: 8F9D4825-797D-48FC-8471-845025792EF6
Does not find any stored CPU BIST information from PPI!
  APICID - 0x00000000, BIST - 0x00000000
  APICID - 0x00000001, BIST - 0x00000000
  APICID - 0x00000002, BIST - 0x00000000
  APICID - 0x00000003, BIST - 0x00000000
  APICID - 0x00000004, BIST - 0x00000000
  APICID - 0x00000005, BIST - 0x00000000
  APICID - 0x00000006, BIST - 0x00000000
  APICID - 0x00000007, BIST - 0x00000000
Install PPI: 9E9F374B-8F16-4230-9824-5846EE766A97
Install PPI: 5CB9CB3D-31A4-480C-9498-29D269BACFBA
Install PPI: EE16160A-E8BE-47A6-820A-C6900DB0250A
Notify: PPI Guid: EE16160A-E8BE-47A6-820A-C6900DB0250A, Peim notify entry point: 836FE5
PlatformPei: ClearCacheOnMpServicesAvailable
CpuMpPei: 5-Level Paging = 0
DiscoverPeimsAndOrderWithApriori(): Found 0x0 PEI FFS files in the 1th FV
DXE IPL Entry
Loading PEIM D6A2CB7F-6A18-4E2F-B43B-9920A733700A
Loading PEIM at 0x0007F749000 EntryPoint=0x0007F760C72 DxeCore.efi
Loading DXE CORE at 0x0007F749000 EntryPoint=0x0007F760C72
AddressBits=40 5LevelPaging=0 1GPage=1
Pml5=1 Pml4=2 Pdp=512 TotalPage=3
Install PPI: 605EA650-C65C-42E1-BA80-91A52AB618C6
Notify: PPI Guid: 605EA650-C65C-42E1-BA80-91A52AB618C6, Peim notify entry point: 82DC98
CoreInitializeMemoryServices:
  BaseAddress - 0x7BDA1000 Length - 0x365F000 MinimalMemorySizeNeeded - 0x322000
InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7F76ECA8
ProtectUefiImageCommon - 0x7F76ECA8
  - 0x000000007F749000 - 0x000000000002F000
DxeMain: MemoryBaseAddress=0x7BDA1000 MemoryLength=0x365F000
HOBLIST address in DXE = 0x7F0E7018
Memory Allocation 0x00000000 0x80D000 - 0x80DFFF
Memory Allocation 0x00000000 0x80E000 - 0x80EFFF
Memory Allocation 0x0000000A 0x7FD80000 - 0x7FFFFFFF
Memory Allocation 0x0000000A 0x810000 - 0x81FFFF
Memory Allocation 0x0000000A 0x807000 - 0x807FFF
Memory Allocation 0x0000000A 0x800000 - 0x805FFF
Memory Allocation 0x0000000A 0x808000 - 0x808FFF
Memory Allocation 0x0000000A 0x809000 - 0x80AFFF
Memory Allocation 0x0000000A 0x80C000 - 0x80CFFF
Memory Allocation 0x0000000A 0x806000 - 0x806FFF
Memory Allocation 0x0000000A 0x80B000 - 0x80BFFF
Memory Allocation 0x00000006 0x7FCFC000 - 0x7FD7FFFF
Memory Allocation 0x0000000A 0x820000 - 0x8FFFFF
Memory Allocation 0x00000004 0x900000 - 0x15FFFFF
Memory Allocation 0x00000000 0xB0000000 - 0xBFFFFFFF
Memory Allocation 0x00000000 0x7FC7C000 - 0x7FCFBFFF
Memory Allocation 0x00000004 0x7FA00000 - 0x7FBFFFFF
Memory Allocation 0x00000007 0x7FC00000 - 0x7FC7BFFF
Memory Allocation 0x00000004 0x7F9C0000 - 0x7F9FFFFF
Memory Allocation 0x00000004 0x7F9BF000 - 0x7F9BFFFF
Memory Allocation 0x00000004 0x7F729000 - 0x7F748FFF
Memory Allocation 0x00000003 0x7F9B3000 - 0x7F9BEFFF
Memory Allocation 0x00000003 0x7F9AE000 - 0x7F9B2FFF
Memory Allocation 0x00000003 0x7F9A9000 - 0x7F9ADFFF
Memory Allocation 0x00000003 0x7F9A4000 - 0x7F9A8FFF
Memory Allocation 0x00000003 0x7F994000 - 0x7F9A3FFF
Memory Allocation 0x00000004 0x7F787000 - 0x7F993FFF
Memory Allocation 0x00000003 0x7F786000 - 0x7F786FFF
Memory Allocation 0x00000007 0x7F785000 - 0x7F785FFF
Memory Allocation 0x00000007 0x7F784000 - 0x7F784FFF
Memory Allocation 0x00000007 0x7F783000 - 0x7F783FFF
Memory Allocation 0x00000007 0x7F782000 - 0x7F782FFF
Memory Allocation 0x00000007 0x7F781000 - 0x7F781FFF
Memory Allocation 0x00000007 0x7F780000 - 0x7F780FFF
Memory Allocation 0x00000007 0x7F77F000 - 0x7F77FFFF
Memory Allocation 0x00000000 0x7F77E000 - 0x7F77EFFF
Memory Allocation 0x00000000 0x7F77D000 - 0x7F77DFFF
Memory Allocation 0x00000000 0x7F77C000 - 0x7F77CFFF
Memory Allocation 0x00000000 0x7F77B000 - 0x7F77BFFF
Memory Allocation 0x00000000 0x7F77A000 - 0x7F77AFFF
Memory Allocation 0x00000000 0x7F779000 - 0x7F779FFF
Memory Allocation 0x00000000 0x7F778000 - 0x7F778FFF
Memory Allocation 0x00000003 0x7F749000 - 0x7F777FFF
Memory Allocation 0x00000003 0x7F749000 - 0x7F777FFF
Memory Allocation 0x00000004 0x7F729000 - 0x7F748FFF
Memory Allocation 0x00000004 0x7F400000 - 0x7F5FFFFF
Memory Allocation 0x00000007 0x7F600000 - 0x7F728FFF
Memory Allocation 0x00000004 0x7BD7D000 - 0x7BD9CFFF
FV Hob            0x900000 - 0x15FFFFF
InstallProtocolInterface: D8117CFE-94A6-11D4-9A3A-0090273FC14D 7F770000
InstallProtocolInterface: 8F644FA9-E850-4DB1-9CE2-0B44698E8DA4 7F0E37B0
InstallProtocolInterface: 09576E91-6D3F-11D2-8E39-00A0C969723B 7F0E3A98
InstallProtocolInterface: 220E73B6-6BDB-4413-8405-B974B108619A 7F0E3230
InstallProtocolInterface: EE4E5898-3914-4259-9D6E-DC7BD79403CF 7F76FF18
Loading driver 9B680FCE-AD6B-4F3A-B60B-F59899003443
InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7ED700C0
Loading driver at 0x0007ED58000 EntryPoint=0x0007ED60042 DevicePathDxe.efi
InstallProtocolInterface: BC62157E-3E33-4FEC-9920-2D3B36D750DF 7ED70A98
ProtectUefiImageCommon - 0x7ED700C0
  - 0x000000007ED58000 - 0x000000000000B600
InstallProtocolInterface: 0379BE4E-D706-437D-B037-EDB82FB772A4 7ED62900
InstallProtocolInterface: 8B843E20-8132-4852-90CC-551A4E4A7F1C 7ED628E0
InstallProtocolInterface: 05C99A21-C70F-4AD2-8A5F-35DF3343F51E 7ED628C0
Loading driver 80CF7257-87AB-47F9-A3FE-D50B76D89541
InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7ED69040
Loading driver at 0x0007ED52000 EntryPoint=0x0007ED55A94 PcdDxe.efi
InstallProtocolInterface: BC62157E-3E33-4FEC-9920-2D3B36D750DF 7ED70618
ProtectUefiImageCommon - 0x7ED69040
  - 0x000000007ED52000 - 0x0000000000005DC0
InstallProtocolInterface: 11B34006-D85B-4D0A-A290-D5A571310EF7 7ED57A80
InstallProtocolInterface: 13A3F0F6-264A-3EF0-F2E0-DEC512342F34 7ED579E0
InstallProtocolInterface: 5BE40F57-FA68-4610-BBBF-E9C5FCDAD365 7ED579B0
InstallProtocolInterface: FD0F4478-0EFD-461D-BA2D-E58C45FD5F5E 7ED57990
Loading driver 2EC9DA37-EE35-4DE9-86C5-6D9A81DC38A7
InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7ED6FC40
Loading driver at 0x0007ED64000 EntryPoint=0x0007ED665A2 AmdSevDxe.efi
InstallProtocolInterface: BC62157E-3E33-4FEC-9920-2D3B36D750DF 7ED6FA98
ProtectUefiImageCommon - 0x7ED6FC40
  - 0x000000007ED64000 - 0x0000000000004280
InstallProtocolInterface: 38C74800-5590-4DB4-A0F3-675D9B8E8026 7ED680C0
Loading driver E750224E-7BCE-40AF-B5BB-47E3611EB5C2
InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7ED6F140
Loading driver at 0x0007ED4D000 EntryPoint=0x0007ED4F26E TdxDxe.efi
InstallProtocolInterface: BC62157E-3E33-4FEC-9920-2D3B36D750DF 7ED6F818
ProtectUefiImageCommon - 0x7ED6F140
  - 0x000000007ED4D000 - 0x0000000000004D40
InstallProtocolInterface: BB00A5CA-08CE-462F-A537-43C74A825CA4 0
Loading driver 733CBAC2-B23F-4B92-BC8E-FB01CE5907B7
InstallProtocolInterface: 5B1B31A1-9562-11D2-8E3F-00A0C969723B 7ED6ECC0
Loading driver at 0x0007F2E4000 EntryPoint=0x0007F2E7180 FvbServicesRuntimeDxe.efi
InstallProtocolInterface: BC62157E-3E33-4FEC-9920-2D3B36D750DF 7ED6E898
ProtectUefiImageCommon - 0x7ED6ECC0
  - 0x000000007F2E4000 - 0x0000000000009000
error: kvm run failed Invalid argument
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000000
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=00000000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 00000000 00000000
CS =0000 00000000 00000000 00000000
SS =0000 00000000 00000000 00000000
DS =0000 00000000 00000000 00000000
FS =0000 00000000 00000000 00000000
GS =0000 00000000 00000000 00000000
LDT=0000 00000000 00000000 00000000
TR =0000 00000000 00000000 00000000
GDT=     00000000 00000000
IDT=     00000000 00000000
CR0=80010033 CR2=00000000 CR3=00000000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000900
Code=<??> ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
tlendacky commented 1 year ago

The unsupported event was reported by the guest to the hypervisor. A 0x404 error code indicates that the page was not validated. Are you using the AMDSEV sev-snp-devel branch script to build OVMF or are you building it on your own? If on your own, ensure that you are not building with SMM or Secure Boot enabled.
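
As an illustration of reading that host log line, here is a small sketch that extracts the fields from the KVM message and looks up the error code. The lookup table contains only the 0x404 code explained above; any other code is reported as unknown rather than guessed. The helper name `decode_vmgexit` is hypothetical, and the regular expression assumes the message format shown in this issue.

```python
import re

# Only the code explained in this thread is listed here.
VC_ERROR_CODES = {0x404: "page not validated"}

# Matches the KVM message format quoted in this issue (an assumption for
# other kernel versions, which may word the message differently).
VMGEXIT_RE = re.compile(
    r"vcpu(?P<vcpu>\d+), guest rIP: (?P<rip>0x[0-9a-fA-F]+) "
    r"vmgexit: unsupported event - "
    r"exit_info_1=(?P<info1>0x[0-9a-fA-F]+), "
    r"exit_info_2=(?P<info2>0x[0-9a-fA-F]+)"
)


def decode_vmgexit(line):
    """Return the parsed fields of an 'unsupported event' line, or None."""
    m = VMGEXIT_RE.search(line)
    if m is None:
        return None
    code = int(m.group("info1"), 16)
    return {
        "vcpu": int(m.group("vcpu")),
        "rip": int(m.group("rip"), 16),
        "exit_info_1": code,
        "exit_info_2": int(m.group("info2"), 16),
        "meaning": VC_ERROR_CODES.get(code, "unknown (not listed in this sketch)"),
    }


line = ("[  198.693401] kvm [1308]: vcpu0, guest rIP: 0x0 vmgexit: "
        "unsupported event - exit_info_1=0x404, exit_info_2=0x0")
print(decode_vmgexit(line)["meaning"])  # page not validated
```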

TheNetAdmin commented 1 year ago

Hi, I used the sev-snp-devel branch. I also tried a pre-built edk2 OVMF from https://github.com/retrage/edk2-nightly, which works on other SEV-SNP platforms but gives the same 0x404 error on my platform.

TheNetAdmin commented 1 year ago

I also noticed that the following command on the host doesn't give any output:

$ sudo dmesg | grep -i rmp
(empty output)

Maybe for some reason the RMP table was not set up on the host?

Another thing I noticed: although the motherboard (Gigabyte MZ72 HB0) supports 8 DRAM DIMMs, I only installed four (which should be one per channel according to the motherboard manual).

Is this another possible reason that the RMP is not set up?

tlendacky commented 1 year ago

$ dmesg | grep -i sev
[    0.656071] SEV-SNP: RMP table physical address 0x0000000054000000 - 0x00000000747fffff
[    2.536187] ccp 0000:42:00.1: sev enabled
[    2.598793] ccp 0000:42:00.1: SEV firmware update successful
[    4.378148] ccp 0000:42:00.1: SEV API:1.51 build:3
[    4.378157] ccp 0000:42:00.1: SEV-SNP API:1.51 build:3
[    4.409337] SEV supported: 410 ASIDs
[    4.409337] SEV-ES and SEV-SNP supported: 99 ASIDs

Hmmm... this is from your earlier comment. Are you sure that your dmesg just hasn't been cleared? Otherwise, you wouldn't even be able to start launching an SNP guest.

The number of DIMMs installed does not affect whether the RMP will be allocated or not.

If you're comfortable with modifying OVMF code, you can try adding some debug statements to QemuFlashInitialize() and QemuFlashDetected() in OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlash.c. Just use a simple: DEBUG ((DEBUG_INFO, "*** DEBUG: %a:%u\n", func, LINE)); to track progress and see where the error occurs (assuming it reaches QemuFlashInitialize()).

TheNetAdmin commented 1 year ago

$ dmesg | grep -i sev
[    0.656071] SEV-SNP: RMP table physical address 0x0000000054000000 - 0x00000000747fffff
[    2.536187] ccp 0000:42:00.1: sev enabled
[    2.598793] ccp 0000:42:00.1: SEV firmware update successful
[    4.378148] ccp 0000:42:00.1: SEV API:1.51 build:3
[    4.378157] ccp 0000:42:00.1: SEV-SNP API:1.51 build:3
[    4.409337] SEV supported: 410 ASIDs
[    4.409337] SEV-ES and SEV-SNP supported: 99 ASIDs

Hmmm... this is from your earlier comment. Are you sure that your dmesg just hasn't been cleared? Otherwise, you wouldn't even be able to start launching an SNP guest.

Oh, I figured it out: I had disabled the IOMMU in the BIOS, so the RMP table message was not present. Now that I have re-enabled the IOMMU, the RMP message is back, but the SEV-SNP guest still reports the same invalid argument error.
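
A quick way to confirm that the RMP message is present is to scan the dmesg text for it. This is only a sketch: it assumes the exact "SEV-SNP: RMP table physical address" wording printed by the host kernel in this issue (other kernel versions may word it differently), and `rmp_range` is a hypothetical helper name.

```python
import re

# Message format as printed by the 5.19-rc6 SNP host kernel in this issue.
RMP_RE = re.compile(
    r"SEV-SNP: RMP table physical address (0x[0-9a-fA-F]+) - (0x[0-9a-fA-F]+)"
)


def rmp_range(dmesg_text):
    """Return (start, end, size_in_MiB) of the RMP table, or None if the
    message is absent (for example when the IOMMU is disabled in the BIOS,
    as happened here)."""
    m = RMP_RE.search(dmesg_text)
    if m is None:
        return None
    start = int(m.group(1), 16)
    end = int(m.group(2), 16)
    size_mib = (end - start + 1) // (1024 * 1024)
    return start, end, size_mib


sample = ("[    0.656071] SEV-SNP: RMP table physical address "
          "0x0000000054000000 - 0x00000000747fffff")
print(rmp_range(sample)[2])  # 520
```

The text passed in would typically be the saved output of `sudo dmesg`; a `None` result suggests that the host never allocated the RMP table.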

The number of DIMMs installed does not affect whether the RMP will be allocated or not.

If you're comfortable with modifying OVMF code, you can try adding some debug statements to QemuFlashInitialize() and QemuFlashDetected() in OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlash.c. Just use a simple: DEBUG ((DEBUG_INFO, "* DEBUG: %a:%u\n", func, LINE**)); to track progress and see where the error occurs (assuming it reaches QemuFlashInitialize()).

Thank you for this detailed guide! I'm working on this debugging and will report what I find.

tlendacky commented 1 year ago

DEBUG ((DEBUG_INFO, "* DEBUG: %a:%u\n", func, LINE**));

Github changed what I typed, it should be: DEBUG ((DEBUG_INFO, "*** DEBUG: %a:%u\n", __func__, __LINE__));

aep commented 1 year ago

If on your own, ensure that you are not building with SMM or Secure Boot enabled.

is there a deeper reason this is incompatible? a customer requires secureboot

tlendacky commented 1 year ago

The SMM support of Qemu/KVM requires the hypervisor to be able to change vCPU register state, which is not possible under SEV-ES/SEV-SNP.

aep commented 1 year ago

sorry, i meant to ask for secureboot only. My colleague found that disabling this made it work:

diff --git a/OvmfPkg/PlatformPei/Platform.c b/OvmfPkg/PlatformPei/Platform.c
index 148240342b..c292a44def 100644
--- a/OvmfPkg/PlatformPei/Platform.c
+++ b/OvmfPkg/PlatformPei/Platform.c
@@ -222,9 +222,9 @@ ReserveEmuVariableNvStore (
   VariableStore = (EFI_PHYSICAL_ADDRESS)(UINTN)PlatformReserveEmuVariableNvStore ();
   PcdStatus     = PcdSet64S (PcdEmuVariableNvStoreReserved, VariableStore);

- #ifdef SECURE_BOOT_FEATURE_ENABLED
-  PlatformInitEmuVariableNvStore ((VOID *)(UINTN)VariableStore);
- #endif
+// #ifdef SECURE_BOOT_FEATURE_ENABLED
+//  PlatformInitEmuVariableNvStore ((VOID *)(UINTN)VariableStore);
+// #endif

   ASSERT_RETURN_ERROR (PcdStatus);
 }
TheNetAdmin commented 1 year ago

DEBUG ((DEBUG_INFO, "* DEBUG: %a:%u\n", func, LINE**));

Github changed what I typed, it should be: DEBUG ((DEBUG_INFO, "*** DEBUG: %a:%u\n", __func__, __LINE__));

I inserted a few such debug prints into the file and located the lines that caused the invalid argument error:

https://github.com/tianocore/edk2/blob/37d3eb026a766b2405daae47e02094c2ec248646/OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlash.c#L66-L67

These two lines failed during the first iteration of this loop.

And I also noticed my current CPU (7443P) microcode is as follows

$ sudo dmesg | grep code
...
[    1.080107] microcode: CPU0: patch_level=0x0a00115d
...

This patch_level does not match any of those described in:

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amd-ucode/README

Could the microcode be a factor in this invalid argument error?

Do you have any ideas for how I can investigate this issue further?

tlendacky commented 1 year ago

sorry, i meant to ask for secureboot only. My colleague found that disabling this made it work:

diff --git a/OvmfPkg/PlatformPei/Platform.c b/OvmfPkg/PlatformPei/Platform.c
index 148240342b..c292a44def 100644
--- a/OvmfPkg/PlatformPei/Platform.c
+++ b/OvmfPkg/PlatformPei/Platform.c
@@ -222,9 +222,9 @@ ReserveEmuVariableNvStore (
   VariableStore = (EFI_PHYSICAL_ADDRESS)(UINTN)PlatformReserveEmuVariableNvStore ();
   PcdStatus     = PcdSet64S (PcdEmuVariableNvStoreReserved, VariableStore);

- #ifdef SECURE_BOOT_FEATURE_ENABLED
-  PlatformInitEmuVariableNvStore ((VOID *)(UINTN)VariableStore);
- #endif
+// #ifdef SECURE_BOOT_FEATURE_ENABLED
+//  PlatformInitEmuVariableNvStore ((VOID *)(UINTN)VariableStore);
+// #endif

   ASSERT_RETURN_ERROR (PcdStatus);
 }

Just FYI, this has been reported and is being worked on upstream in the edk2-devel mailing list.

tlendacky commented 1 year ago

DEBUG ((DEBUG_INFO, "* DEBUG: %a:%u\n", func, LINE**));

Github changed what I typed, it should be: DEBUG ((DEBUG_INFO, "*** DEBUG: %a:%u\n", __func__, __LINE__));

I inserted a few such debug prints into the file and located the lines that caused the invalid argument error:

https://github.com/tianocore/edk2/blob/37d3eb026a766b2405daae47e02094c2ec248646/OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlash.c#L66-L67

These two lines failed during the first iteration of this loop.

I haven't seen this issue before, so I'm not sure why that is happening.

And I also noticed my current CPU (7443P) microcode is as follows

$ sudo dmesg | grep code
...
[    1.080107] microcode: CPU0: patch_level=0x0a00115d
...

This patch_level does not match any of those described in:

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amd-ucode/README

Could the microcode be a factor in this invalid argument error?

Shouldn't be, but you can always download that firmware file and place it in /lib/firmware/amd-ucode/ and see if that helps.
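
To compare the running patch level against the values in the amd-ucode README before and after such an update, something like the following sketch could be used. The known patch levels are supplied by the caller because none are quoted in this thread; the value in the usage example is a placeholder, not a real README entry, and both helper names are hypothetical.

```python
import re

PATCH_RE = re.compile(r"microcode: CPU(\d+): patch_level=(0x[0-9a-fA-F]+)")


def patch_levels(dmesg_text):
    """Map CPU number to patch level for every microcode line in dmesg."""
    return {int(cpu): int(level, 16) for cpu, level in PATCH_RE.findall(dmesg_text)}


def unlisted(levels, known_levels):
    """Return the CPUs whose patch level is not in the caller-supplied set."""
    return {cpu: lvl for cpu, lvl in levels.items() if lvl not in known_levels}


sample = "[    1.080107] microcode: CPU0: patch_level=0x0a00115d"
levels = patch_levels(sample)
print({cpu: hex(lvl) for cpu, lvl in levels.items()})  # {0: '0xa00115d'}

# Placeholder set: replace with the patch levels listed in the README.
hypothetical_known = {0x0A000000}
print(unlisted(levels, hypothetical_known))
```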

Do you have any ideas for how I can investigate this issue further?

I would say to wait for the next level of SNP code that will be submitted upstream and try that. There should be corresponding updates to the stable-commits file.