AMDESE / AMDSEV

AMD Secure Encrypted Virtualization

[Help] Distro with SEV-SNP support already enabled? #197

Closed rjzak closed 8 months ago

rjzak commented 8 months ago

I'm trying to get my AMD Epyc 7313 system going with SEV-SNP, and I'm finding it difficult. I've tried Debian 12.2, Ubuntu Server 23.10, CentOS Stream 9 (and with the Fedora 39 beta 1 kernel), but it seems the SNP patches aren't available, or not enabled by any distro by default.

In short, what's the state of upstream having SEV-SNP support, and distros compiling in support for it? Essentially, I'm trying to use it with Enarx, and SEV is enabled, just not SNP.

✗ Backend: sev
  ✔ Driver: /dev/sev
  ✗  SEV-SNP is enabled in host kernel
  ✔ Driver: /dev/kvm
  ✔  API Version: 12
  ✔ CPU: AMD EPYC 7313 16-Core Processor                 | AuthenticAMD
  ✔  Microcode support: AMD EPYC 7313 16-Core Processor                
  ✔  Secure Memory Encryption (SME)
  ✔   Physical address bit reduction: 51
  ✔   C-bit location in page table entry: 19
  ✔  Secure Encrypted Virtualization (SEV)
  ✔   Number of encrypted guests supported simultaneously: 509
  ✔   Minimum ASID value for SEV-enabled, SEV-ES disabled guest: 1
  ✔  Secure Encrypted Virtualization Secure Nested Paging (SEV-SNP)
  ✔  Page Flush MSR available
  ✔ /dev/sev is readable by user
  ✔ /dev/sev is writable by user
  ✔ MEMLOCK rlimit allows for: ~1 keep (soft limit = 8388608 bytes, hard limit = 8388608 bytes)
  ✗ SEV-SNP VCEK key cache file: No such file or directory (os error 2)
  ✗ AMD CRL cache file: Error reading `/var/cache/amd-sev/crls.der`

Thank you!

tlendacky commented 8 months ago

SEV-SNP guest support was accepted into the Linux kernel in 5.19, and SEV-SNP hypervisor support is still in the process of being upstreamed. You will need to build the required components yourself, depending on your kernel level.

See: https://github.com/AMDESE/AMDSEV/tree/snp-latest

rjzak commented 8 months ago

Thanks!

Trundle commented 8 months ago

Note that Enarx doesn't work with the latest host changes (i.e. the snp-host-latest branch of AMDESE/linux). There is an open PR (https://github.com/enarx/enarx/pull/2522) to support a more recent host kernel, but it's also outdated by now (e.g. still without KVM_CREATE_GUEST_MEMFD).

rjzak commented 8 months ago

Thanks @Trundle. I'm trying to maintain Enarx, yet I'm not a kernel developer. Might you have any advice on how to keep Enarx updated with the latest host SEV-SNP patches? I do have hardware!

rjzak commented 7 months ago

@Trundle I'm having some SNP issues. I have a kernel and Enarx with patches from @Freax13 here https://github.com/enarx/enarx/pull/2552.

I can't get the system to see the SEV-SNP part. /dev/sev/ is there, KVM works. I'm trying to see what to investigate next:

rjzak commented 7 months ago

More info: with DMA remapping, I can boot Ubuntu 22.04 desktop (kernel 6.2) from a USB drive just fine, but fdisk cannot see the SATA controller.

Another thought: is secure boot required for SNP?

Sorry for asking these things here, I haven't been able to find info on this elsewhere.

Trundle commented 7 months ago

No parameter should be required for kvm_amd, the module has SEV-SNP support enabled by default if all requirements are met. You should also see in the kernel logs whether SEV-SNP is supported or only SEV or SEV-ES:

kernel: kvm_amd: SEV-ES and SEV-SNP supported: 499 ASIDs
kernel: kvm_amd: SEV enabled (ASIDs 500 - 509)
kernel: kvm_amd: SEV-ES enabled (ASIDs 1 - 499)

Some of the lines might be missing if the system doesn't support SEV-SNP and the actual values can differ of course.
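
For anyone who wants to script this check, here is a minimal Python sketch that scans log text for the three kvm_amd lines quoted above. The function name `summarize_kvm_amd` and both regular expressions are illustrative assumptions modeled only on those sample lines; other kernel versions may word the messages differently.

```python
import re

# Patterns modeled on the kvm_amd lines quoted in this thread (an assumption;
# the exact wording can vary between kernel versions).
SUPPORTED = re.compile(
    r"kvm_amd: (?P<features>[\w\- ]+?) supported: (?P<count>\d+) ASIDs"
)
ENABLED = re.compile(
    r"kvm_amd: (?P<feature>[\w\-]+) enabled \(ASIDs (?P<lo>\d+) - (?P<hi>\d+)\)"
)


def summarize_kvm_amd(log_text):
    """Report whether SEV-SNP is listed as supported and which ASID ranges are enabled."""
    summary = {"snp_supported": False, "ranges": {}}
    for line in log_text.splitlines():
        match = SUPPORTED.search(line)
        if match and "SEV-SNP" in match.group("features"):
            summary["snp_supported"] = True
        match = ENABLED.search(line)
        if match:
            summary["ranges"][match.group("feature")] = (
                int(match.group("lo")),
                int(match.group("hi")),
            )
    return summary
```

Feeding it the output of `dmesg` (or `journalctl -k`) should be enough; a result with `snp_supported` left at `False` suggests that only SEV or SEV-ES came up.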

In the BIOSes I've seen so far, there is typically a setting such as "SEV-ES ASID Space Limit" or "Minimum SEV non-ES ASID" that also might need to be changed. It influences how many SEV-ES/SEV-SNP vs non-ES VMs can be started. You also need to enable SME (Secure Memory Encryption). If no such settings are present in your BIOS, there is some chance an update adds them. You could check whether your vendor publishes update notes or a reference manual for BIOS settings to verify beforehand.

Secure boot is not required for SNP.

The latest CPU firmware is available at https://www.amd.com/en/developer/sev.html and is likely packaged by your Linux distribution via linux-firmware.

I can't really comment on the DMA issues, sorry.

tlendacky commented 7 months ago

In addition to the SMEE and Min SEV ASID settings, you will need to enable RMP memory coverage in BIOS. The OS does not allocate memory for the RMP, it relies on it being allocated by BIOS (found using the RMP_BASE and RMP_END MSRs). Also, SNP requires an active IOMMU (the IOMMU cannot be in passthrough mode) and there is a setting in BIOS to enable SNP support for the IOMMU. The output from: dmesg | egrep -i "ccp|sev|rmp" on the host/hypervisor might provide more insight.
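
As a rough aid for the IOMMU requirement, the Python sketch below looks for kernel command-line options that would turn the IOMMU off or put it into passthrough mode. The helper name `iommu_passthrough_hints` is hypothetical, and the option list is an assumption that may be incomplete. It cannot detect an IOMMU that is disabled in BIOS, and a kernel built with passthrough as its default would not show up here either.

```python
# Command-line options that disable the IOMMU or request passthrough mode.
# This list is an assumption and may not be exhaustive.
SUSPICIOUS_OPTIONS = {
    "iommu=pt",
    "iommu=off",
    "iommu.passthrough=1",
    "amd_iommu=off",
}


def iommu_passthrough_hints(cmdline):
    """Return the command-line tokens that conflict with an active IOMMU."""
    return [token for token in cmdline.split() if token in SUSPICIOUS_OPTIONS]


if __name__ == "__main__":
    # /proc/cmdline holds the command line of the running kernel.
    with open("/proc/cmdline") as handle:
        print(iommu_passthrough_hints(handle.read()))
```

An empty list only means that nothing on the command line forces passthrough; the BIOS settings still need to be checked separately.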

rjzak commented 7 months ago

SMT: enabled

Virtualization:
SR-IOV: Enabled
DMA Remapping: Disabled
Access Control Service: enabled (default)

Memory:
TMSE: off (also tried on, no difference)
Min ASID: 14 or 100, no effect
Max ASID: 509 or 253, tried both

There's nothing in the BIOS at SNP or RMP, and I updated the BIOS firmware this week to the latest from HPE. But this seems to be the issue:

dmesg | egrep -i "ccp|sev|rmp"
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.6.0-rc1+ root=UUID=1d6037c2-4b2a-4aa0-9d43-30ce67128fbe ro mem_encrypt=on kvm_amd.sev=1 kvm_amd.sev_snp=1 kvm.sev_es=1 sev=debug systemd.log_level=debug systemd.log_target=kmsg log_buf_len=15M
[    0.252095] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.6.0-rc1+ root=UUID=1d6037c2-4b2a-4aa0-9d43-30ce67128fbe ro mem_encrypt=on kvm_amd.sev=1 kvm_amd.sev_snp=1 kvm.sev_es=1 sev=debug systemd.log_level=debug systemd.log_target=kmsg log_buf_len=15M
[    0.642786] SEV-SNP: Memory for the RMP table has not been reserved by BIOS
[   11.417518] systemd-udevd[570]: Reading rules file: /etc/udev/rules.d/99-sev.rules
[   11.589402] ccp 0000:42:00.1: enabling device (0140 -> 0142)
[   11.612210] ccp 0000:42:00.1: no command queues available
[   11.781190] ccp 0000:42:00.1: sev enabled
[   11.781208] ccp 0000:42:00.1: psp enabled
[   11.835228] ccp 0000:42:00.1: SEV firmware update successful
[   11.897649] ccp 0000:42:00.1: SEV API:1.55 build:14
[   11.993745] ccp 0000:42:00.1: SEV API:1.55 build:14
[   12.065617] kvm: unknown parameter 'sev_es' ignored
[   12.116928] kvm_amd: SEV-ES supported: 99 ASIDs
[   12.116931] kvm_amd: SEV enabled (ASIDs 100 - 253)
[   12.116934] kvm_amd: SEV-ES enabled (ASIDs 1 - 99)
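
The two lines in this output that matter most (the missing RMP reservation and the ignored module parameter) are easy to overlook. A small Python sketch like the following can pick them out of a saved log. The function name `triage` and the wording of the findings are illustrative assumptions; the matched strings are taken from the output above and may differ on other kernels.

```python
import re

# Message text copied from the dmesg output above.
RMP_MISSING = "Memory for the RMP table has not been reserved by BIOS"
UNKNOWN_PARAM = re.compile(
    r"(?P<module>\w+): unknown parameter '(?P<param>\w+)' ignored"
)


def triage(dmesg_text):
    """Return short findings for log lines that explain why SNP did not come up."""
    findings = []
    for line in dmesg_text.splitlines():
        if RMP_MISSING in line:
            findings.append("RMP not reserved by BIOS")
        match = UNKNOWN_PARAM.search(line)
        if match:
            findings.append(
                f"{match.group('module')} ignored parameter {match.group('param')}"
            )
    return findings
```

The findings are only hints about where to look next, not a diagnosis.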
rjzak commented 7 months ago

[Screenshots of the BIOS settings screens, 2023-11-17]

![2023-11-17 17 12 41](https://github.com/AMDESE/AMDSEV/assets/1133882/d30b019f-d9a0-4999-8388-2dd8334ad749)
![2023-11-17 17 12 57](https://github.com/AMDESE/AMDSEV/assets/1133882/154d9db5-84a2-43c7-80cd-e0826597a744)

tlendacky commented 7 months ago

> SMT: enabled

This setting is irrelevant.

> Virtualization: SR-IOV: Enabled DMA Remapping: Disabled

You will need this to be enabled for SNP. When set to enabled, does it expose any new SNP related options?

> Access Control Service: enabled (default)
>
> Memory: TMSE: off (also tried on, no difference) Min ASID: 14 or 100, no effect Max ASID: 509 or 253, tried both
>
> There's nothing in the BIOS at SNP or RMP, and I updated the BIOS firmware this week to the latest from HPE. But this seems to be the issue:
>
> dmesg | egrep -i "ccp|sev|rmp"
> [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.6.0-rc1+ root=UUID=1d6037c2-4b2a-4aa0-9d43-30ce67128fbe ro mem_encrypt=on kvm_amd.sev=1 kvm_amd.sev_snp=1 kvm.sev_es=1 sev=debug systemd.log_level=debug systemd.log_target=kmsg log_buf_len=15M
> [    0.252095] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.6.0-rc1+ root=UUID=1d6037c2-4b2a-4aa0-9d43-30ce67128fbe ro mem_encrypt=on kvm_amd.sev=1 kvm_amd.sev_snp=1 kvm.sev_es=1 sev=debug systemd.log_level=debug systemd.log_target=kmsg log_buf_len=15M
> [    0.642786] SEV-SNP: Memory for the RMP table has not been reserved by BIOS

Without the RMP being allocated by BIOS, SNP won't be enabled. Looks like you'll need to contact HPE about SNP support in the BIOS.

> [   11.417518] systemd-udevd[570]: Reading rules file: /etc/udev/rules.d/99-sev.rules
> [   11.589402] ccp 0000:42:00.1: enabling device (0140 -> 0142)
> [   11.612210] ccp 0000:42:00.1: no command queues available
> [   11.781190] ccp 0000:42:00.1: sev enabled
> [   11.781208] ccp 0000:42:00.1: psp enabled
> [   11.835228] ccp 0000:42:00.1: SEV firmware update successful
> [   11.897649] ccp 0000:42:00.1: SEV API:1.55 build:14
> [   11.993745] ccp 0000:42:00.1: SEV API:1.55 build:14
> [   12.065617] kvm: unknown parameter 'sev_es' ignored

That should be kvm_amd.sev_es=1. But, really, none of those are required because the default value for sev, sev_es, and sev_snp is true, now. They will be set to false during module load if they can't be enabled.

> [   12.116928] kvm_amd: SEV-ES supported: 99 ASIDs
> [   12.116931] kvm_amd: SEV enabled (ASIDs 100 - 253)
> [   12.116934] kvm_amd: SEV-ES enabled (ASIDs 1 - 99)
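
Since the module clears these parameters at load time when a feature cannot be enabled, their final values can be read back afterwards. The Python sketch below assumes they are exposed as files under `/sys/module/kvm_amd/parameters` (the usual location for readable module parameters); whether a `sev_snp` file exists depends on the kernel carrying the SNP host patches. The function name `read_kvm_amd_params` is hypothetical.

```python
from pathlib import Path


def read_kvm_amd_params(param_dir="/sys/module/kvm_amd/parameters",
                        names=("sev", "sev_es", "sev_snp")):
    """Map each parameter name to its current value, or None if the file is absent."""
    values = {}
    for name in names:
        path = Path(param_dir) / name
        # Boolean module parameters are normally shown as "Y" or "N".
        values[name] = path.read_text().strip() if path.is_file() else None
    return values


if __name__ == "__main__":
    for name, value in read_kvm_amd_params().items():
        print(f"{name}: {value}")
```

A value of `N` for `sev_snp` would mean the module loaded but SNP could not be enabled; `None` would mean the running kernel does not expose that parameter at all.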

Freax13 commented 7 months ago

Would it be possible for the OS to initialize the RMP_BASE and RMP_END MSRs instead of relying on the BIOS? I don't see anything in the manuals that would prevent this, but it's also very possible that I missed something.

rrelph commented 7 months ago

Tom, You’ll want to learn how to update the HPE BIOS. I think this is the one you want: https://support.hpe.com/connect/s/softwaredetails?language=en_US&softwareId=MTX_e87fa7295f974fa6ae1d1303fe

It’s entirely possible that earlier BIOSes didn’t have support for SNP and you may see different options in the BIOS after the upgrade.

I believe that if you have a block of memory available for an RMP large enough to cover all of memory, and Linux *does not know about* that block, it would be possible to set RMP_BASE and RMP_END to refer to the block and then unload and reload all the relevant Linux drivers. That might work. I’ve used the mem= option to limit the amount of memory Linux is aware of, knowing I have more than that quantity of memory in the system, and then set RMP_BASE and RMP_END to refer to the memory above what Linux knows about.

But I strongly recommend getting the latest and greatest BIOS for your system and configuring the RMP via the BIOS.

Richard


rjzak commented 7 months ago

@rrelph I did update the BIOS on the machine in question and it didn't help, and there's no option for SNP. But somehow this machine was used with SNP before, but I'm unable to figure out how. I'll work with @Freax13 to see if we can figure out this strategy.

rrelph commented 7 months ago

OK… so forgive the question… what model EPYC CPU do you have in the system? Just dotting all the Is...


rjzak commented 7 months ago

CPU: AMD EPYC 7313 16-Core Processor (family: 0x19, model: 0x1, stepping: 0x1)
Server: HPE ProLiant DL385 Gen10 Plus v2

tlendacky commented 6 months ago

> Would it be possible for the OS to initialize the RMP_BASE and RMP_END MSRs instead of relying on the BIOS? I don't see anything in the manuals that would prevent this, but it's also very possible that I missed something.

It would be possible for the OS to set RMP_BASE and RMP_END MSRs, but Linux would have to reserve a contiguous block of memory large enough to cover all of system RAM and so that's why we rely on BIOS.
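
To give a feel for how large that contiguous block has to be, here is a back-of-the-envelope Python sketch. The numbers are assumptions that are not stated in this thread: one 16-byte RMP entry per 4 KiB page plus a fixed 16 KiB bookkeeping area at the start of the table. Please check them against the AMD documentation for your CPU before relying on the result. The function name `rmp_table_bytes` is illustrative.

```python
PAGE_SIZE = 4096                   # the RMP tracks memory at 4 KiB granularity
RMP_ENTRY_BYTES = 16               # assumed size of one RMP entry
RMP_BOOKKEEPING_BYTES = 16 * 1024  # assumed fixed overhead before the entries


def rmp_table_bytes(covered_bytes):
    """Estimate the RMP size needed to cover `covered_bytes` of physical address space."""
    pages = -(-covered_bytes // PAGE_SIZE)  # ceiling division
    return RMP_BOOKKEEPING_BYTES + pages * RMP_ENTRY_BYTES


if __name__ == "__main__":
    gib = 2 ** 30
    size = rmp_table_bytes(256 * gib)
    print(f"RMP for 256 GiB: {size / gib:.3f} GiB")
```

Under these assumptions the table is roughly 1/256 of the covered memory, so a machine with a few hundred GiB of RAM needs about a gigabyte of physically contiguous memory, which is hard to obtain once the OS is running.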

> CPU: AMD EPYC 7313 16-Core Processor (family: 0x19, model: 0x1, stepping: 0x1) Server: HPE ProLiant DL385 Gen10 Plus v2

Can you share the output of: cpuid -1 -r -l 0x8000001f

rjzak commented 6 months ago
❯ cpuid -1 -r -l 0x8000001f
CPU:
   0x8000001f 0x00: eax=0x0101fd3f ebx=0x00004133 ecx=0x000000fd edx=0x0000000f
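
For readers following along, the raw registers can be decoded with a few bit operations. The Python sketch below uses the field layout of CPUID leaf 0x8000001F as I understand it from the AMD programmer's manual (treat the layout as an assumption and verify it there); the function name `decode_fn8000_001f` and the dictionary keys are illustrative.

```python
def decode_fn8000_001f(eax, ebx, ecx, edx):
    """Decode the memory-encryption fields of CPUID leaf 0x8000001F."""
    return {
        # EAX: feature flags
        "sme": bool(eax & (1 << 0)),
        "sev": bool(eax & (1 << 1)),
        "page_flush_msr": bool(eax & (1 << 2)),
        "sev_es": bool(eax & (1 << 3)),
        "sev_snp": bool(eax & (1 << 4)),
        "vmpl": bool(eax & (1 << 5)),
        # EBX: page-table and VMPL details
        "c_bit_position": ebx & 0x3F,
        "phys_addr_reduction": (ebx >> 6) & 0x3F,
        "num_vmpl": (ebx >> 12) & 0xF,
        # ECX / EDX: ASID limits
        "num_encrypted_guests": ecx,
        "min_sev_no_es_asid": edx,
    }


if __name__ == "__main__":
    fields = decode_fn8000_001f(0x0101FD3F, 0x00004133, 0x000000FD, 0x0000000F)
    for name, value in fields.items():
        print(f"{name}: {value}")
```

With the register values posted above, the SEV-SNP bit decodes as set and the C-bit position as 51, so the CPU itself advertises SNP.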
Freax13 commented 6 months ago

Here's a patch that allocates the RMP table at boot time if the alloc_rmp kernel parameter is set: https://github.com/Freax13/linux/commit/5d517728f9ed27dad1879efbffbc2356ce63af41.

For some reason I had to make sure that the RMP table is 2MiB aligned, otherwise I'd get seemingly spurious RMP faults for memory in the hypervisor state (I also confirmed this by inspecting the RMP table). I got these faults for addresses within 1MiB after the RMP end. AFAIK the CPU wants the RMP to be 8KiB aligned and the firmware requires 1MiB alignment, but I didn't see anything about a 2MiB alignment requirement. Is this a known issue or did I maybe mess up something else?
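
The alignment arithmetic involved is small enough to sketch in Python. The helper names `align_up` and `is_aligned` are hypothetical and are not taken from the linked patch; the code only illustrates how an address that satisfies the 1MiB requirement can still fail a 2MiB one.

```python
KIB = 1024
MIB = 1024 * KIB


def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (alignment must be a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def is_aligned(addr, alignment):
    """Check whether addr is a multiple of alignment (a power of two)."""
    return (addr & (alignment - 1)) == 0


if __name__ == "__main__":
    base = 0x1_0010_0000  # example address: 1 MiB aligned, but not 2 MiB aligned
    print(is_aligned(base, 8 * KIB), is_aligned(base, 1 * MIB), is_aligned(base, 2 * MIB))
    print(hex(align_up(base, 2 * MIB)))
```

Rounding the start of the reserved block up to 2MiB costs at most one extra 2MiB of reserved memory, which is negligible next to the size of the table itself.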

rjzak commented 6 months ago

I was able to finally get SNP to work. After weeks of working with HPE, they were able to help me to get SNP enabled, and @Freax13's V10 enabled Linux kernel and Enarx patches worked! Thanks for helping me with this.