AMDESE / AMDSEV

AMD Secure Encrypted Virtualization

SEV and IOMMU #80

Open hallojs opened 2 years ago

hallojs commented 2 years ago

Hi, I want to use an NVIDIA GPU with CUDA together with SEV, so I have enabled IOMMU. With SEV and SEV-ES, I was able to start the VM and install the required drivers, and the GPU was detected correctly. The problem now is that the data I copy from the SEV VM to the GPU for parallel processing is encrypted, so the computations on the GPU produce wrong results. Additionally, after the computations on the GPU are complete, I would like to transfer the unencrypted results back to the SEV VM. My question is whether SEV considers such a use case, and what the best way might be to transfer the data to the GPU and back unencrypted.

Besides, I can't start the VM with SEV-SNP when IOMMU is active. Is that intended, or am I doing something wrong?

Thanks!

fenghao176 commented 2 years ago

@hallojs I tried to use a Tesla P4 in an SEV VM via VFIO, but couldn't install the GPU driver. Which NVIDIA GPU do you use?

hallojs commented 2 years ago

Hi @fenghao2021, we use an Nvidia T4, but I think there should be no difference between the GPU models in that respect.

fenghao176 commented 2 years ago

@hallojs The Tesla P4 driver doesn't support VFIO usage in an SEV VM, so I'm glad to know that the Nvidia T4 can be used.

hallojs commented 2 years ago

@fenghao2021 I don't think that the driver support of the two GPUs differs in terms of SEV. We still have the problem that we can't get the data unencrypted from the SEV VM into the GPU memory.

fenghao176 commented 2 years ago

@hallojs Which driver are you using? I will give it a try once I get T4 hardware. I remember there is a kernel function set_memory_decrypted(). Maybe you can allocate some memory in the kernel, call set_memory_decrypted() to mark that memory as decrypted, and then use it for the GPU processing. I'm not sure whether that is suitable for your use case.
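
To illustrate the idea, here is a toy user-space model in Python. It is only a sketch under loud assumptions: it does not call any kernel or SEV API, the XOR "cipher" merely stands in for the hardware memory encryption, and the names `GuestMemory`, `mark_shared`, `device_read`, and `device_write` are hypothetical. It shows why a device that reads guest-private pages sees ciphertext, and how staging data through a page marked as shared (roughly what set_memory_decrypted() is meant to achieve in the kernel) lets the device see plaintext.

```python
# Toy model only: NOT the kernel API and NOT real SEV cryptography.
from dataclasses import dataclass, field

PAGE_SIZE = 16    # tiny "page" so the example stays readable
GUEST_KEY = 0x5A  # stand-in for the per-VM memory encryption key


def _xor(data: bytes) -> bytes:
    """Placeholder cipher: XOR with a fixed byte (an assumption for illustration)."""
    return bytes(b ^ GUEST_KEY for b in data)


@dataclass
class GuestMemory:
    """Raw physical memory plus a per-page 'private' (encrypted) flag."""
    num_pages: int
    raw: bytearray = field(init=False)
    private: list = field(init=False)

    def __post_init__(self):
        self.raw = bytearray(self.num_pages * PAGE_SIZE)
        self.private = [True] * self.num_pages  # every page starts private

    def guest_write(self, page: int, data: bytes) -> None:
        # The guest's writes are encrypted transparently on private pages.
        stored = _xor(data) if self.private[page] else bytes(data)
        start = page * PAGE_SIZE
        self.raw[start:start + len(data)] = stored

    def guest_read(self, page: int, length: int) -> bytes:
        start = page * PAGE_SIZE
        chunk = bytes(self.raw[start:start + length])
        return _xor(chunk) if self.private[page] else chunk

    def device_read(self, page: int, length: int) -> bytes:
        # A device doing DMA sees the raw bytes; nothing is decrypted for it.
        start = page * PAGE_SIZE
        return bytes(self.raw[start:start + length])

    def device_write(self, page: int, data: bytes) -> None:
        start = page * PAGE_SIZE
        self.raw[start:start + len(data)] = data

    def mark_shared(self, page: int) -> None:
        # Hypothetical analogue of marking a page decrypted/shared.
        # This model simply clears the page instead of preserving its contents.
        start = page * PAGE_SIZE
        self.raw[start:start + PAGE_SIZE] = bytes(PAGE_SIZE)
        self.private[page] = False


def demo():
    mem = GuestMemory(num_pages=2)
    payload = b"1234"

    # Page 0 is private: the device reads ciphertext.
    mem.guest_write(0, payload)
    seen_private = mem.device_read(0, len(payload))

    # Page 1 is marked shared and used as a staging buffer.
    mem.mark_shared(1)
    mem.guest_write(1, mem.guest_read(0, len(payload)))
    seen_shared = mem.device_read(1, len(payload))

    # The device writes its result into the shared page; the guest reads it back.
    mem.device_write(1, b"4321")
    result = mem.guest_read(1, 4)
    return mem, seen_private, seen_shared, result


if __name__ == "__main__":
    _, seen_private, seen_shared, result = demo()
    print("device view of private page:", seen_private)
    print("device view of shared page: ", seen_shared)
    print("guest view of device result:", result)
```

Anything placed in the shared page is visible to the host as well, so in a real setup only data that may leave the VM's trust boundary should be staged this way.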

hallojs commented 2 years ago

@fenghao2021 I just checked it, and we use Driver 510.47.03. We installed it together with the CUDA-Toolkit: https://developer.nvidia.com/cuda-downloads.

Ah yes, that could work. We switched to Intel SGX for our current use case, but we may come back to that later. Thanks! 😊

wdsun1008 commented 1 year ago

@hallojs Are you still following the work on GPU + SNP? Recently, I tried the combination of an Nvidia 3090 and SEV, and encountered a similar issue: after GPU processing, the data turns into ciphertext. However, when I examined the code of the Nvidia Open GPU Kernel Modules, I found that Nvidia has implemented checks and handling for SEV, presumably decrypting the relevant memory. When I use SNP, I encounter error #177, which seems to occur when memory is set as shared and pvalidate is called, resulting in the memory being invalidated.

zvonkok commented 1 year ago

NVIDIA released early access to its Confidential Computing stack, enabling H100 GPUs with SEV-SNP: https://www.nvidia.com/en-us/data-center/solutions/confidential-computing/ You need the proper HW and SW to make SEV-SNP work.

hallojs commented 1 year ago

Hi @wdsun1008, for our application, we finally switched to Intel SGX. This works very well. As @zvonkok mentioned, there are now also solutions from Nvidia to integrate the GPU into the trust base of the TEE. I'm unsure whether you can enable Nvidia Confidential Computing on your own HW, but I guess there are options at Microsoft Azure.