VirtualGL / virtualgl

Main VirtualGL repository
https://VirtualGL.org

xpra vglrun glxinfo OpenGL not selecting nvidia for renderer #137

Closed DevinBayly closed 4 years ago

DevinBayly commented 4 years ago

Hello there,

I'm setting up a way to give users of an HPC system access to graphical applications using the NVIDIA hardware included in their allocated sessions. However, none of the configuration I've performed has changed the default renderer that OpenGL is trying to use: OpenGL Renderer: llvmpipe (LLVM 9.0, 256 bits).

I've performed the steps below, based on the instructions at https://virtualgl.org/Documentation/HeadlessNV. Please let me know if I'm missing anything. First we ran nvidia-xconfig --query-gpu-info, which showed an NVIDIA GPU at PCI:32:0:0.

Then we ran nvidia-xconfig -a --allow-empty-initial-configuration --virtual=1920x1200 --busid PCI:32:0:0

That created a new /etc/X11/xorg.conf with the following contents:

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 450.51.06

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    Screen      1  "Screen1" RightOf "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/input/mice"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "Tesla K20Xm"
    BusID          "PCI:32:0:0"
EndSection

Option "HardDPMS" "false"
Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "Tesla K20Xm"
    BusID          "PCI:32:0:0"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    SubSection     "Display"
        Virtual     1920 1200
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    SubSection     "Display"
        Virtual     1920 1200
        Depth       24
    EndSubSection
EndSection
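
A configuration like the one above can be checked mechanically for directives that sit outside any Section ... EndSection pair, since xorg.conf is organized as a series of such blocks. The following is a minimal sketch only; the function name find_stray_directives is hypothetical, and it assumes the keywords are spelled with the capitalization nvidia-xconfig generates. Whether a stray directive has anything to do with the renderer problem described here is not established by this thread.

```shell
# Hypothetical helper: report non-comment, non-blank lines that appear
# outside any Section ... EndSection block in an xorg.conf-style file.
# Reads the files named as arguments, or standard input if none are given.
find_stray_directives() {
    awk '
        /^[ \t]*(#|$)/      { next }                 # skip comments and blank lines
        /^[ \t]*Section[ \t]/ { inside = 1; next }   # a top-level section opens
        /^[ \t]*EndSection/ { inside = 0; next }     # the section closes
        inside == 0         { printf "line %d: %s\n", NR, $0 }
    ' "$@"
}
```

It could be run as find_stray_directives /etc/X11/xorg.conf. SubSection and EndSubSection lines are not treated as section boundaries, so nested Display blocks are handled as ordinary section content.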

Then I tried a couple of things with xpra, but in both cases the vglrun command didn't change the renderer to the nvidia hardware.

I launched an xterm window from this same interactive session and ran vglrun glxspheres64, but the output still showed llvmpipe as the OpenGL renderer:

Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
Visual ID of window: 0x21
Context is Direct
OpenGL Renderer: llvmpipe (LLVM 9.0, 256 bits)

The frame rate was around 20 fps, and no programs were listed as active in the nvidia-smi report. I'm convinced this isn't an issue with xpra but with the way I configured VirtualGL. Is there anything obvious I'm missing?
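
The renderer line shown above is the quickest signal of whether rendering fell back to software. As a rough sketch, a small filter can classify that line automatically; the function name renderer_kind and the list of software renderer names (llvmpipe, softpipe, swrast) are assumptions made for illustration, and anything else is reported only as "other" rather than being confirmed as a GPU.

```shell
# Hypothetical helper: read glxinfo/glxspheres-style output on standard input
# and classify the first "OpenGL renderer" line it finds.
renderer_kind() {
    renderer=$(grep -i 'opengl renderer' | head -n 1 || true)
    case "$renderer" in
        "")                             echo "unknown" ;;   # no renderer line seen
        *llvmpipe*|*softpipe*|*swrast*) echo "software" ;;  # Mesa software rasterizers
        *)                              echo "other" ;;     # anything else, e.g. a GPU name
    esac
}
```

One possible use is vglrun /opt/VirtualGL/bin/glxinfo | renderer_kind from inside the proxy session; a result of "software" would match the llvmpipe symptom reported here.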

dcommander commented 4 years ago

Nothing obvious. Try the "Sanity Check" procedure to verify that you are able to access the GPU through the 3D X server: https://cdn.rawgit.com/VirtualGL/virtualgl/2.6.4/doc/index.html#hd006002001

Also check the environment and make sure someone hasn't done something stupid, such as setting VGL_DISPLAY or VGL_GLLIB incorrectly.
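
The environment check suggested above can be done in a few lines of shell. This is only a sketch: the helper name list_vgl_env is made up for illustration, and it simply prints every variable whose name starts with VGL_ so that an unexpected VGL_DISPLAY or VGL_GLLIB value stands out.

```shell
# Hypothetical helper: print every VGL_* variable in the current environment,
# or a short note if none are set.
list_vgl_env() {
    vgl_vars=$(env | grep '^VGL_' || true)
    if [ -n "$vgl_vars" ]; then
        printf '%s\n' "$vgl_vars"
    else
        echo "no VGL_* variables set"
    fi
}

list_vgl_env
```

Running it in the same shell that launches vglrun matters, because a container or session wrapper may set these variables differently from a login shell.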

dcommander commented 4 years ago

Also, it would be helpful if you could reproduce the same error using another X proxy. It's entirely possible that Xpra is attempting to do something novel with VirtualGL and is somehow messing things up. If this issue is specific to Xpra, then it is a matter for the Xpra developers.

totaam commented 4 years ago

xpra maintainer here. He has already asked us.

This is not an xpra bug or a virtualgl bug IMO, just a setup issue. He's using a singularity container and probably does not have access to an accelerated X11 server to begin with.

> It's entirely possible that Xpra is attempting to do something novel with VirtualGL and is somehow messing things up.

No. We don't do anything like that. vglrun glxgears from an xterm running via xpra should always work if VirtualGL is set up correctly.

dcommander commented 4 years ago

@DevinBayly If you can confirm that you are using a container, then that info should have been included in your post, and if you can confirm that VirtualGL works without the container, then this is not our problem.

DevinBayly commented 4 years ago

Hi Antoine!

I'll have to work with the infrastructure team so that we can get access to the X11 server.

Yes, I'm using a singularity container.

But we haven't been able to make a glxgears window with the HPC's gpu card used for rendering even without the singularity container. That was the initial reason I started looking into xpra and singularity.

I'll check out the Sanity Check that you recommend. I think it's going to be a bit more complex, since the DISPLAY variable only exists when I run the actual xpra commands, but I believe I'll be able to pass these commands as a script to execute:

xauth merge /etc/opt/VirtualGL/vgl_xauth_key
xdpyinfo -display :0
/opt/VirtualGL/bin/glxinfo -display :0 -c

or this

xdpyinfo -display :0
/opt/VirtualGL/bin/glxinfo -display :0 -c

totaam commented 4 years ago

> But we haven't been able to make a glxgears window with the HPC's gpu card used for rendering even without the singularity container.

Then you should really fix this problem before making it more complicated.

DevinBayly commented 4 years ago

Good point. I'll focus on that before trying to bring xpra or singularity into the mix. I'll close this, and if other issues come up I'll submit them separately.

crazyleeth commented 3 years ago

Hello there. I'm running into a similar problem. VGL on my server only works without the NVIDIA hardware; it uses that LLVM renderer. However, display :1 can use the NVIDIA driver, while the display created by TurboVNC can only use the integrated video card. My system is Ubuntu 18.04.6 with gdm3, and the graphics hardware is 10 2080 Ti cards. The driver version is about 450, and the VGL version is 2.6.5. When I follow the guide, I can't find the file "vgl_xauth_key" anywhere. Could that be the key problem? Looking forward to your reply.

dcommander commented 3 years ago

@crazyleeth Please do not hijack other issues, particularly issues that are closed and which may or may not be related to yours. Post a new issue.