VirtualGL / virtualgl

Main VirtualGL repository
https://VirtualGL.org

Troubleshooting VirtualGL with NVIDIA GPU Operator in EKS #253

Closed: Mohamed-ben-khemis closed this issue 6 months ago

Mohamed-ben-khemis commented 6 months ago

Issue Summary

VirtualGL fails to detect the GPU in my EKS (Amazon Elastic Kubernetes Service) cluster, which uses the NVIDIA GPU Operator. nvidia-smi confirms that the GPU is present, but running glxgears under vglrun produces the following error:

vglrun -d /dev/nvidia0 glxgears
[VGL] ERROR: in init3D--
[VGL] 228: Invalid EGL device

Details

Issue

When glxgears is launched through vglrun with /dev/nvidia0 as the device, VirtualGL fails during 3D initialization (init3D) with an "Invalid EGL device" error, so no GPU-accelerated rendering takes place.

Questions

  1. How can I troubleshoot and resolve the issue of VirtualGL failing to detect and utilize GPUs within my container environment?
  2. Are there additional configurations or dependencies required to enable GPU acceleration with VirtualGL on EKS using the NVIDIA GPU Operator?

Additional Information

@ubuntu-fk5a8-91b4d208t9nxv:/etc/X11/xorg.conf.d$ nvidia-smi 
Fri May  3 11:14:58 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       On  |   00000000:00:1E.0 Off |                    0 |
| N/A   25C    P8             14W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

dcommander commented 6 months ago

You already asked an identical question in #252, and I already answered it. You are using VirtualGL incorrectly. Now you are also using GitHub incorrectly by spamming me with duplicate issues. Please do not do that again, or you will be blocked from this project.
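
For readers who land here without the context of #252: as far as the VirtualGL documentation describes it, the EGL back end expects the argument to -d to be a DRI device path (for example /dev/dri/card0) or the keyword egl, not an NVIDIA control node such as /dev/nvidia0. The sketch below is only an illustration under that assumption and is not taken from this thread. It lists the DRI card nodes visible inside the container and prints a candidate command. The function name list_dri_devices and the DRI_DIR override are hypothetical, and whether /dev/dri is exposed to the pod at all depends on how the container runtime is configured.

```shell
#!/bin/sh
# Sketch: find DRI card nodes that could be passed to `vglrun -d`.
# list_dri_devices and DRI_DIR are hypothetical names used for illustration.

list_dri_devices() {
    dir="${1:-/dev/dri}"
    for dev in "$dir"/card*; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        [ -e "$dev" ] && echo "$dev"
    done
    return 0
}

devices=$(list_dri_devices "${DRI_DIR:-/dev/dri}")

if [ -z "$devices" ]; then
    # Nothing under /dev/dri means the EGL back end has no device to open.
    echo "No DRI card devices are visible in this container" >&2
else
    first=$(printf '%s\n' "$devices" | head -n 1)
    echo "Candidate command: vglrun -d $first glxgears"
fi
```

If the script reports no devices, the container probably cannot see /dev/dri, and that would need to be addressed in the pod or runtime configuration before VirtualGL can use the GPU through EGL.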