nanovms / ops

ops - build and run nanos unikernels
https://ops.city
MIT License

On-prem instances: add support for GPU passthrough #1528

Closed francescolavra closed 11 months ago

francescolavra commented 11 months ago

This change adds support for running on-prem instances with GPU devices. These devices must be connected to the host (which must be an x86_64 Linux machine) and are made available to the guest via PCI passthrough. To enable GPU passthrough on an on-prem instance, set the "GPUs" attribute of the "RunConfig" object in the Ops configuration to the number of GPUs to be made available to the instance. Example:

  "RunConfig": {
    "GPUs": 1
  }
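As a minimal sketch of how the attribute could be added to a configuration that already has other run settings, the following Python snippet merges "GPUs" into an existing "RunConfig" object and prints the resulting JSON. The helper name with_gpus and the "Memory" entry are illustrative assumptions, not part of this change; only "RunConfig" and "GPUs" come from the description above.

```python
import json


def with_gpus(config, gpu_count):
    """Return a copy of an Ops-style config dict with RunConfig.GPUs set.

    with_gpus is a hypothetical helper for illustration; it is not part of Ops.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    merged = dict(config)
    # Copy the nested object so the caller's dict is left untouched.
    run_config = dict(merged.get("RunConfig", {}))
    run_config["GPUs"] = gpu_count
    merged["RunConfig"] = run_config
    return merged


# "Memory" stands in for whatever settings the config already holds (assumed).
existing = {"RunConfig": {"Memory": "2G"}}
print(json.dumps(with_gpus(existing, 1), indent=2))
```

The printed JSON can be saved to a file and used as the Ops configuration for the instance.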

GPU passthrough requires I/O virtualization to be supported and enabled on the host. This may require enabling VT-d (for Intel CPUs) or AMD IOMMU (for AMD CPUs) in the BIOS settings, and adding the "intel_iommu=on iommu=pt" (for Intel CPUs) or "amd_iommu=on iommu=pt" (for AMD CPUs) options to the Linux kernel command line.

In addition, the vfio-pci Linux kernel driver must be loaded and bound to the GPU device(s) to be used by an instance. If the driver is built into the kernel binary, add "vfio-pci.ids=<vendor>:<device>" (where <vendor> is the PCI vendor ID and <device> is the PCI device ID of the host GPU) to the kernel command line. Otherwise (if the driver is built as a kernel module), ensure that the driver is loaded with the "ids" option set to the PCI vendor and device ID of the GPU (for example, create a file named /etc/modprobe.d/vfio.conf which contains "options vfio-pci ids=10de:1eb8").

For VFIO devices to be accessible by the QEMU process, their file attributes must be properly configured, e.g. with the following udev rule: SUBSYSTEM=="vfio", OWNER="root", GROUP="kvm"
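The host-side prerequisites can be sanity-checked with a short script. The sketch below is an illustration only and is not part of Ops: it checks a kernel command line for the IOMMU options named above and builds the vfio-pci setting in both forms (kernel argument and modprobe option line). All function names are hypothetical. Joining several IDs with commas relies on the vfio-pci "ids" parameter accepting a comma-separated list, which is an assumption beyond what the description states.

```python
import re
from pathlib import Path

# A PCI ID pair as used above, e.g. "10de:1eb8" (vendor:device, 4 hex digits each).
PCI_ID = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$")

# Kernel command line option named in the description for each CPU vendor.
IOMMU_OPTION = {"intel": "intel_iommu=on", "amd": "amd_iommu=on"}


def iommu_options_present(cmdline, cpu_vendor):
    """True if both the vendor-specific option and iommu=pt are present."""
    tokens = set(cmdline.split())
    return IOMMU_OPTION[cpu_vendor] in tokens and "iommu=pt" in tokens


def vfio_ids(ids):
    """Validate vendor:device pairs and join them with commas."""
    ids = list(ids)
    for pci_id in ids:
        if not PCI_ID.match(pci_id):
            raise ValueError("not a vendor:device pair: %r" % pci_id)
    return ",".join(ids)


def kernel_arg(ids):
    """Form used when vfio-pci is built into the kernel binary."""
    return "vfio-pci.ids=" + vfio_ids(ids)


def modprobe_line(ids):
    """Form used in /etc/modprobe.d/vfio.conf when vfio-pci is a module."""
    return "options vfio-pci ids=" + vfio_ids(ids)


if __name__ == "__main__":
    try:
        cmdline = Path("/proc/cmdline").read_text()
    except OSError:
        cmdline = ""  # not on Linux, or the file is unreadable
    print("Intel IOMMU options present:", iommu_options_present(cmdline, "intel"))
    print("AMD IOMMU options present:", iommu_options_present(cmdline, "amd"))
    print(modprobe_line(["10de:1eb8"]))
```

The script only inspects and formats strings. It does not modify the boot configuration, load the driver, or verify that the driver is actually bound to the GPU.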