google / gvisor

Application Kernel for Containers
https://gvisor.dev
Apache License 2.0

nvproxy: Support NVIDIA driver 550.127.05 #11111

Open danielnorberg opened 2 days ago

danielnorberg commented 2 days ago

Description

CVE‑2024‑0126 affects NVIDIA Linux GPU v550.x drivers prior to 550.127.05. Latest v550.x driver supported by gVisor is 550.90.07.
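To make the version relationship concrete: the driver versions here must be compared component by component as numbers, because a plain string comparison would order "550.90.07" after "550.127.05". Below is a minimal sketch in Go of such a check. The helper names `parseVersion` and `isAffected` are hypothetical and are not part of gVisor or nvproxy, which has its own driver version handling.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a dotted driver version such as "550.127.05"
// into three integer components. Hypothetical helper, not gVisor's API.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected major.minor.patch, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("bad component %q in %q: %w", p, v, err)
		}
		out[i] = n
	}
	return out, nil
}

// isAffected reports whether v is on the same major branch as fixed
// and is numerically older than fixed.
func isAffected(v, fixed string) (bool, error) {
	a, err := parseVersion(v)
	if err != nil {
		return false, err
	}
	b, err := parseVersion(fixed)
	if err != nil {
		return false, err
	}
	if a[0] != b[0] {
		// Different major branch: out of scope for this check.
		return false, nil
	}
	for i := 1; i < 3; i++ {
		if a[i] != b[i] {
			return a[i] < b[i], nil
		}
	}
	return false, nil // identical versions: already fixed
}

func main() {
	affected, err := isAffected("550.90.07", "550.127.05")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("550.90.07 affected:", affected)
}
```

Under these assumptions, 550.90.07 counts as affected because its minor component 90 is less than 127, even though it sorts later as a string.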

Is this feature related to a specific bug?

No response

Do you have a specific solution in mind?

Add support for NVIDIA Linux GPU driver 550.127.05.

jseba commented 2 days ago

Relatedly, does the gvisor team have a communication channel open with nvidia related to CVEs so that if there are potential ABI breaking changes needed to be pulled in to patch vulnerabilities, you all have an appropriate heads up and can coordinate an update?

ayushr2 commented 2 days ago

IIUC CVE‑2024‑0126 is about the NVIDIA GPU Display Driver containing a vulnerability. gVisor doesn't support the display driver capability itself: https://github.com/google/gvisor/blob/6adc0720b2e66d3dee7e115d93ec3347f9a8a212/pkg/sentry/devices/nvproxy/nvconf/caps.go#L53-L70

I am not sure if GPU display workloads would even work inside gVisor.

ayushr2 commented 2 days ago

> Relatedly, does the gvisor team have a communication channel open with nvidia related to CVEs so that if there are potential ABI breaking changes needed to be pulled in to patch vulnerabilities, you all have an appropriate heads up and can coordinate an update?

As per https://gvisor.dev/docs/user_guide/gpu/#driver-versions gVisor tracks the driver versions available in GKE. I believe GKE has the right processes & channels in place to be notified of security vulnerabilities. When GKE makes such security updates, gVisor is notified to add support for the needed versions.

ayushr2 commented 2 days ago

> IIUC CVE‑2024‑0126 is about the NVIDIA GPU Display Driver containing a vulnerability.

Actually, @nixprime pointed out that NVIDIA seems to always refer to the GPU driver as the "GPU Display Driver" in CVEs (maybe due to legacy reasons). So nvproxy might be impacted.

We are tracking this.