sdake closed this issue 1 year ago
Issue within cloud-hypervisor: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/5319.
Thank you, -steve
"I confirm that this does not happen with the proprietary driver package."
"Try to use nvidia-vgpu with either a proprietary or open-source driver."
These two statements can't both be true.
@ttabi A simple thank you for spending 20 minutes of my life reporting a bug against your product offerings would be in order.
Thank you. -steve
Sorry for that, @sdake. Thanks for the report.
The open-gpu-kernel-modules do not yet support virtualization. We're currently working on it (it requires changes both in open-gpu-kernel-modules and in the GSP firmware); it may be a few releases before the support is added.
It arguably should be better called out, but the lack of virtualization support is mentioned, buried in the GPU driver README:
http://us.download.nvidia.com/XFree86/Linux-x86_64/535.98/README/kernel_open.html
Sorry for the inconvenience.
@aritger for the record, I know it's not the place, but GPU virtualization should REALLY not be locked out of consumer boards. At least let us vGPU one instance so some of us can virtualize a GPU-enabled Windows VM or something.
Thanks for your feedback. I will relay your message to the appropriate team.
@aritger Did you read my request? I will repeat it so we understand each other.
Your drivers, whether proprietary or open, lack vgpu support for any hypervisor other than QEMU. The broader accelerated computing community much prefers to use modern hypervisors, such as cloud-hypervisor. Unfortunately, the vgpu implementation, even when paid for, is structured in a way that does not work with this hypervisor.
Thank you -steve
I made sure the vgpu team is aware of your concern. Thank you for bringing this to our attention.
Cool, thanks Aaron! Super appreciate it. GitHub is a phenomenal tool to interact with technology suppliers. Not sure how it looks from the other perspective.
Cheers, Steve
On Tue, Oct 10, 2023 at 12:05 PM Aaron Plattner wrote:
I made sure the vgpu team is aware of your concern. Thank you for bringing this to our attention.
NVIDIA Open GPU Kernel Modules Version
535.86.10
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.
Operating System and Version
Description: Debian GNU/Linux 12 (bookworm)
Kernel Release
Linux wise-a40x1-1 6.1.0-11-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.38-4 (2023-08-08) x86_64 GNU/Linux
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
Hardware: GPU
NVIDIA A30 24G
Describe the bug
The NVIDIA VGPU software does not function with the cloud-hypervisor virtual machine monitor. An extensive analysis has been completed, and a summary has been produced.
To Reproduce
Try to use nvidia-vgpu with either a proprietary or open-source driver. In either case, the nvidia-vgpu-vgpu mdev control plane has odd expectations about the command line for the virtual machine monitor.
A very short summary is that nvidia-vgpu is hardcoded to QEMU. Most modern accelerated compute startups appear to want to use cloud-hypervisor as the technology has superior performance and quality.
Bug Incidence
Always
nvidia-bug-report.log.gz
We are all working towards the same goal. I don't have the nvidia-vgpu software at this time. I will ask the individual that filed the issue to attach nvidia-bug-report.sh output.
@dengxuehua would you be kind enough to follow this issue tracker as well as provide the results of nvidia-bug-report.sh?
More Info
No response