AMDESE / AMDSEV

AMD Secure Encrypted Virtualization

SEV Failed to INIT #176

Closed by seedindream 10 months ago

seedindream commented 10 months ago

Hi, I am setting up my first SNP VM now. I encounter errors as follows:

[ 0.758838] SEV-SNP: RMP table physical address 0x0000000089f00000 - 0x00000000aa4fffff
[ 2.945157] ccp 0000:43:00.1: sev enabled
[ 2.986897] ccp 0000:43:00.1: SEV firmware update successful
[ 3.014584] ccp 0000:43:00.1: SEV-SNP: failed to INIT error 0x3
[ 3.031919] ccp 0000:43:00.1: SEV: failed to INIT error 0x1, rc -5
[ 3.031942] ccp 0000:43:00.1: SEV API:1.51 build:3
[ 3.200995] SEV supported: 239 ASIDs
[ 3.200995] SEV-ES and SEV-SNP supported: 14 ASIDs

I noticed that https://github.com/AMDESE/AMDSEV/issues/154 reports the same problem. However, I didn't disable SMT in my setup. In the BIOS, I enabled SMEE/SNP under CPU common and also enabled SEV-SNP under NBIO common. Any ideas or suggestions are welcome! Thanks

tlendacky commented 10 months ago

I see that it did a firmware update, what level of firmware are you updating from and to?

seedindream commented 10 months ago

I didn't intentionally update any firmware. The "SEV firmware update successful" message just appears after I switch to the snp-host kernel and reboot. Is there any way I can help you find the firmware version info?

tlendacky commented 10 months ago

The SEV firmware files are located in /lib/firmware/amd/. You could move them for one boot to see what the base firmware version is that is reported (since we see that after upgrade, the version is 1.51.3).

seedindream commented 10 months ago

This is the output after hiding those firmware files:

[ 2.921509] ccp 0000:43:00.1: sev enabled
[ 2.921512] ccp 0000:43:00.1: psp enabled
[ 2.924699] ptdma 0000:02:00.2: enabling device (0000 -> 0002)
[ 2.930835] ccp 0000:43:00.1: SEV-SNP support requires firmware version >= 1:51
[ 2.941865] ccp 0000:43:00.1: SEV: failed to INIT error 0x1, rc -5
[ 2.942058] ccp 0000:43:00.1: SEV API:1.23 build:23
[ 2.947887] Console: switching to colour dummy device 80x25

tlendacky commented 10 months ago

Ok, I think that firmware is actually too old (1.23.23) to properly upgrade from. Can you update the BIOS on that system to something more recent that will have a newer SEV firmware as its base?
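To make the version comparison concrete, here is a minimal Python sketch that pulls the "SEV API" version out of dmesg text and compares it with the 1.51 minimum that the driver message above states for SNP. The helper names `parse_sev_api` and `meets_snp_minimum` are hypothetical, not part of any AMDSEV tooling. The sketch only checks the SNP minimum; it does not encode which base versions can be upgraded successfully, since the thread does not give that threshold.

```python
import re

# Minimum SEV API version the driver reports as required for SNP
# ("SEV-SNP support requires firmware version >= 1:51" in the log above).
MIN_SNP_API = (1, 51)

# Matches lines such as "ccp 0000:43:00.1: SEV API:1.23 build:23".
# It does not match the separate "SEV-SNP API:..." line.
_SEV_API_RE = re.compile(r"\bSEV API:(\d+)\.(\d+) build:(\d+)")


def parse_sev_api(dmesg_text):
    """Return (major, minor, build) from the first 'SEV API' line, or None."""
    match = _SEV_API_RE.search(dmesg_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def meets_snp_minimum(version, minimum=MIN_SNP_API):
    """True if the (major, minor) part of the version is at least the minimum."""
    return version[:2] >= minimum
```

Fed the line "SEV API:1.23 build:23" from the boot without the firmware files, `parse_sev_api` yields `(1, 23, 23)`, which `meets_snp_minimum` rejects; the post-update value `(1, 51, 3)` passes.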

seedindream commented 10 months ago

Thanks! Let me try to update the BIOS and get back to you in a few days.

seedindream commented 10 months ago

After updating the BIOS, the previous "failed to INIT" error is gone. But I now encounter a new error:

qemu-system-x86_64: warning: kvm_encrypt_reg_region: failed to set memory attr (0xffc84000+0x37c000) error 'Inappropriate ioctl for device'
qemu-system-x86_64: warning: Failed to convert memory, invalid address: start 0xffc84000 size 0x37c000 shared_to_private 1, ret: -25
qemu-system-x86_64: SEV-SNP: failed to configure initial private guest memory

Is this because of my QEMU script? This is what it looks like:

    system("sudo /home/amd/AMDSEV/usr/local/bin/qemu-system-x86_64 \
    -enable-kvm -cpu EPYC-v4 -machine q35 -smp 1,maxcpus=1 -m 4096M,slots=5,maxmem=30G -no-reboot \
    -drive if=pflash,format=raw,unit=0,file=/home/amd/AMDSEV/snp-release-2023-08-20/usr/local/share/qemu/OVMF_CODE.fd,readonly \
    -drive if=pflash,format=raw,unit=1,file=/home/amd/snp_22.04/image.fd -netdev user,id=vmnic,hostfwd=tcp::7777-:22  \
    -device virtio-net-pci,disable-legacy=on,iommu_platform=true,netdev=vmnic,romfile= \
    -drive file=/home/amd/snp_22.04/image.qcow2,if=none,id=disk0,format=qcow2 \
    -device virtio-scsi-pci,id=scsi0,disable-legacy=on,iommu_platform=true \
    -device scsi-hd,drive=disk0 \
    -machine memory-encryption=sev0,vmport=off -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1 \
    -nographic -monitor pty -monitor unix:monitor,server,nowait ");

The kernels running inside the guest and on the host match: 5.19.0-rc6-snp-guest/host-c4daeffce56e. I appreciate your help!

mdroth commented 10 months ago

The error message suggests you're trying to use a newer QEMU, so you need to update your host/guest kernels, and OVMF as well. In general, if you update one component, you need to update everything to the latest. This is a development tree so there is no backward-compatibility between individual components.

Also, if you're using a newer QEMU, you need to use a different set of command-line options. The launch-qemu.sh script serves as a useful reference for what your command-line should look like for any particular point-in-time build of the components managed by AMDSEV build scripts: https://github.com/AMDESE/AMDSEV/blob/snp-latest/launch-qemu.sh#L275

seedindream commented 10 months ago

Hi, I updated the kernel version and made sure I used the newly built QEMU inside the AMDSEV folder. However, I encountered these new errors. The QEMU side reports:

sev_snp_launch_update:SNP_LAUNCH_UPDATE ret =-22 fw_error = 0

The dmesg suggests: [ 986.381740] kvm_amd: SEV-SNP requires restricted memory.

The dmesg related to snp and rmp during boot time:

[    0.804593] SEV-SNP: RMP table physical address [0x0000000089f00000 - 0x00000000aa4fffff]
[  122.610484] ccp 0000:43:00.1: sev enabled
[  124.645639] ccp 0000:43:00.1: SEV API:1.52 build:4
[  124.645647] ccp 0000:43:00.1: SEV-SNP API:1.52 build:4
[  124.650396] kvm_amd: SEV-ES and SEV-SNP supported: 19 ASIDs
[  124.650397] kvm_amd: SEV enabled (ASIDs 20 - 253)
[  124.650398] kvm_amd: SEV-ES enabled (ASIDs 1 - 19)

Another interesting observation: on the host side, after updating to the newest host kernel built from the up-to-date "snp-latest" branch (6.5.0-rc2-snp-host-ad9c0bf475ec), some I/O drivers do not seem to work well. For example, the NIC driver does not work properly, and the boot procedure gets stuck for a long time while loading I/O drivers. Also, when I reboot the host, I always hit a "system-shutdown[1]: waiting for process: systemd-udevd, systemd-udevd" error, which blocks the host from rebooting. Those errors disappear after switching to other kernel versions. Any ideas? Thanks!

mdroth commented 10 months ago

"kvm_amd: SEV-SNP requires restricted memory."

This suggests you're missing the -machine ...,kvm-type=protected option. There are several command-line changes you need to make in order to use the new kernel/QEMU. Please update your QEMU options to align with what the launch-qemu.sh script does:

https://github.com/AMDESE/AMDSEV/blob/snp-latest/launch-qemu.sh#L275
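As a quick sanity check before launching, a small Python sketch like the following could confirm that a QEMU argument list carries the kvm-type=protected machine option mentioned above. The function names are hypothetical, and the check covers only this one option for this point-in-time development build; the full set of required options should still be taken from launch-qemu.sh.

```python
import shlex


def machine_options(qemu_args):
    """Collect the comma-separated options from every '-machine' argument."""
    options = []
    # Pair each argument with the one that follows it, so a '-machine'
    # flag is matched with its value.
    for flag, value in zip(qemu_args, qemu_args[1:]):
        if flag == "-machine":
            options.extend(value.split(","))
    return options


def has_protected_kvm_type(qemu_args):
    """True if any '-machine' argument includes kvm-type=protected."""
    return "kvm-type=protected" in machine_options(qemu_args)


def check_command_line(command_line):
    """Split a shell-style command string and run the check on it."""
    return has_protected_kvm_type(shlex.split(command_line))
```

Applied to the -machine arguments from the script earlier in this thread (`-machine q35` and `-machine memory-encryption=sev0,vmport=off`), `check_command_line` returns False, which matches the "SEV-SNP requires restricted memory" symptom.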

If there are issues with I/O drivers they are likely related to the particular development kernel that the SNP patches are based on top of, but I can't really confirm without seeing more debug information since it isn't something I've seen on our systems.

seedindream commented 10 months ago

Thanks for your patience and kind help. I can launch an SNP-enabled VM now. The I/O driver problem turns out to be related to "systemd-udevd", but it doesn't stop me from running an SNP-enabled VM.