NVIDIA / nvtrust

Ancillary open source software to support confidential computing on NVIDIA GPUs
Apache License 2.0

CVM not detecting the second GPU #62

Open iihihiuh opened 2 months ago

iihihiuh commented 2 months ago

Dear all,

I am trying to run distributed workloads on multiple GPUs in CC mode. However, my CVM does not detect the second GPU I pass with -device; it only loads whichever GPU is listed first. Below is the command I use to launch the CVM. Can anyone tell me which additional options I should include?

/usr/bin/qemu-system-x86_64 \
  -accel kvm \
  -name process=tdxvm,debug-threads=on \
  -m 128G \
  -vga none \
  -monitor pty \
  -no-hpet \
  -nodefaults \
  -drive file=/home/yongqin/Drivers/tdx-tools/build/ubuntu-22.04/guest-image/td-guest-ubuntu-22.04.qcow2,if=virtio,format=qcow2 \
  -monitor telnet:127.0.0.1:9002,server,nowait \
  -bios /usr/share/qemu/OVMF.fd \
  -object tdx-guest,sept-ve-disable,id=tdx \
  -object memory-backend-memfd-private,id=ram1,size=128G \
  -cpu host,-kvm-steal-time,pmu=off \
  -machine q35,kernel_irqchip=split,confidential-guest-support=tdx,memory-backend=ram1 \
  -device virtio-net-pci,netdev=mynet0 \
  -smp 32 \
  -netdev user,id=mynet0,hostfwd=tcp::10026-:22 \
  -chardev stdio,id=mux,mux=on,logfile=/home/yongqin/Drivers/tdx-tools/vm_log_2024-07-16T1845.log \
  -device virtio-serial,romfile= \
  -device virtconsole,chardev=mux \
  -monitor chardev:mux \
  -serial chardev:mux \
  -nographic \
  -no-hpet \
  -nodefaults \
  -device pcie-root-port,id=pci.1,bus=pcie.0 \
  -device vfio-pci,host=b0:00.0,bus=pci.1 \
  -device vfio-pci,host=b1:00.0,bus=pci.1 \
  -fw_cfg name=opt/ovmf/X-PciMmio64Mb,string=262144

Specifically, I use these two options to pass through my two GPUs: -device vfio-pci,host=b0:00.0,bus=pci.1 -device vfio-pci,host=b1:00.0.

For some reason, only the first GPU is passed through to the CVM (I swapped their order, and the CVM only loads whichever comes first).

Thanks in advance.

Tan-YiFan commented 1 month ago

According to the NVIDIA CC GA release notes:

Only one GPU per VM is allowed. Multiple GPUs assigned to a VM will produce undefined behavior.

So it seems that NVIDIA CC does not support multiple GPUs per VM in this release.
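Given that limit, a quick pre-launch check on the QEMU command line can catch the problem before the guest boots. The snippet below is only a minimal sketch, not part of nvtrust or QEMU: the helper name `vfio_hosts` is hypothetical, and it assumes that every `-device vfio-pci,host=...` entry is a GPU, which may not hold if other devices are also passed through.

```python
import shlex
from typing import List


def vfio_hosts(qemu_cmdline: str) -> List[str]:
    """Return the host PCI addresses of all vfio-pci devices in a QEMU command line.

    Hypothetical helper for illustration only. It assumes each vfio-pci
    device is a GPU, which is an assumption about the host setup.
    """
    args = shlex.split(qemu_cmdline)
    hosts = []
    # Walk over (flag, value) pairs and keep only "-device vfio-pci,..." entries.
    for flag, value in zip(args, args[1:]):
        if flag != "-device":
            continue
        driver, *props = value.split(",")
        if driver != "vfio-pci":
            continue
        for prop in props:
            key, _, val = prop.partition("=")
            if key == "host":
                hosts.append(val)
    return hosts


if __name__ == "__main__":
    # Shortened version of the command from this issue.
    cmd = (
        "/usr/bin/qemu-system-x86_64 -accel kvm "
        "-device pcie-root-port,id=pci.1,bus=pcie.0 "
        "-device vfio-pci,host=b0:00.0,bus=pci.1 "
        "-device vfio-pci,host=b1:00.0,bus=pci.1"
    )
    found = vfio_hosts(cmd)
    if len(found) > 1:
        print("More than one passthrough device requested:", ", ".join(found))
```

If the check reports more than one address, the release note quoted above suggests trimming the command to a single GPU per CVM.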