Closed. accetto closed this issue 3 years ago.
@talent-less wrote on Apr 16, 2021:
The docker image works great, but it would be even better if it could run HW-accelerated WebGL content or native OpenGL apps, like Blender, etc.
FYI: https://medium.com/@pigiuz/hw-accelerated-gui-apps-on-docker-7fd424fe813e
@accetto wrote on Apr 20, 2021:
Thanks for the hint. I'll check it.
@accetto wrote on Apr 25, 2021:
Hello @talent-less ,
I'm working on it and expect that I'll publish some experimental images soon. At the moment it seems like an image with VirtualGL and Blender would be sufficient for testing, wouldn't it?
However, I'll not be able to test it to the end, because I currently do not own any computer with Nvidia GPU. Would you agree to help with testing?
If yes, then what would be your preferred scenario and what Nvidia drivers should I pre-install for you in the test images?
Regards, accetto
@talent-less wrote on Apr 26, 2021:
@accetto Sure! Blender should be sufficient.
I have an Intel integrated GPU and an NVIDIA GTX 1050. Maybe I can also test it on other machines with more powerful graphics like a 2080 Ti. For the driver, I would try sudo apt install nvidia-driver-460
@accetto wrote on Apr 26, 2021:
Thanks, that will help. We'll probably need a few iterations, because it's not exactly my playground, but hopefully we'll reach something. :)
Do you maybe know what exactly the nvidia-headless-460 drivers do? Are they something exclusively for the cloud (maybe), or a kind of superset/subset of the nvidia-driver-460 drivers?
@talent-less wrote on Apr 27, 2021:
I guess nvidia-headless-460 is for Ubuntu without the X server. Maybe I can try both on aws. But I think that shouldn't be very problematic, as the driver is only required on the host machine.
For gl libs in the docker you can maybe refer to
https://github.com/nytimes/rd-blender-docker/blob/master/dist/2.80-gpu-ubuntu18.04/Dockerfile
https://gitlab.com/nvidia/container-images/opengl/-/blob/ubuntu20.04/glvnd/runtime/Dockerfile
@accetto wrote on Apr 27, 2021:
Thanks for the hints. I've already developed a few alternatives that seem to work. However, it would help if you could describe the first use case you would like to test. How do you plan to use the container? Just locally or in a cloud? Do you plan to use it with VNC, or do you want to share the X11 socket? Would you prefer TurboVNC, or would TigerVNC be OK?
BTW: There will be an unexpected delay, because there is an ad-hoc problem with the TigerVNC download hosting (see this issue). I have to configure a mirror and update all my containers, because it will be gone this weekend.
@talent-less wrote on Apr 28, 2021:
I will probably use it on a cloud platform as I don't have an NVIDIA GPU on my laptop. I think it will be very useful for people who want to perform long-running tasks like Blender rendering while retaining a basic level of interactivity with the GUI, instead of running the full thing in headless mode.
It would be great if everything can be streamed via noVNC, just like in your firefox project. Can it be a layer on top of the firefox project? Then I can run both firefox and other OpenGL software using that image as starting point. If that is not possible, connecting with a VNC client via TCP should also be good.
I will not use the domain-socket way, and I think that approach is already covered by other containers.
@accetto wrote on Apr 28, 2021:
Thanks, let's start with this use case. I plan to prepare a few versions, with Firefox and also with Chromium. They will include noVNC and also TigerVNC, because they go together in my containers. Then I'll include Mesa3D utilities and VirtualGL. If I've understood you correctly, I should not install NVidia drivers inside the containers.
BTW, I always forget to mention it. You've filed the issue against the image accetto/ubuntu-vnc-xfce-firefox, which is a Generation 1 image based on Ubuntu 18.04 LTS. However, I plan to release Generation 3 images, which are based on Ubuntu 20.04 LTS. They are generally slimmer and faster. I'll probably put it all into the project accetto/headless-drawing-g3. Would that be OK? Is there any show stopper for you?
@talent-less wrote on Apr 30, 2021:
Sounds great! accetto/headless-drawing-g3 should be a better place for software like blender :)
@accetto wrote on May 1, 2021:
@talent-less , FYI
Just in case you've noticed it: I've already started to publish Blender images to the repository accetto/ubuntu-vnc-xfce-blender-g3 on Docker Hub. These are not yet the images we've discussed here. Those Blender images do not include any support for OpenGL/WebGL/VirtualGL yet. You can use them later to compare performance, for example.
However, because I have no experience with Blender, I would appreciate it if you could test whether something is missing there. Generally I try to keep the images as slim as possible, so at first I include only the really required packages.
I've noticed, for example, that Blender uses Python and it seems to install it itself. However, it's still possible that something is missing there.
Btw, you can always install additional packages, because sudo is included. Do not forget to execute sudo apt-get update first. The default sudo password is headless.
@accetto wrote on May 3, 2021:
I've just published the images with Blender that include Mesa3D utilities and VirtualGL. Please go to the Docker Hub repository accetto/ubuntu-vnc-xfce-blender-g3 and look for the tags starting with 'vgl-'. Note that these images are not mentioned on the Readme page, because they are still experimental.
Unfortunately I cannot test it myself, because I currently do not have access to any Nvidia GPUs and I'm also missing experience with Blender and VirtualGL.
A few additional remarks:
- Mesa3D utilities should support OpenGL and WebGL.
- glxgears (for OpenGL) and es2gears and es3tri (for WebGL) are also included.
- I've found that VirtualGL can be used inside a container like:
vglrun -d :1 glxgears
vglrun -d :1 blender
I'm not sure if that's all that is needed, but it seems that the display argument is required. However, I hope that you already know what is needed.
The resources for the images are on GitHub in the experimental branch exp-vgl.
Please let me know if it is working and/or something is still to improve.
@talent-less wrote on May 5, 2021:
@accetto Hi, I tried sudo docker run --name blender -p 6901:6901 --gpus all --device=/dev/dri/card0:/dev/dri/card0 accetto/ubuntu-vnc-xfce-blender-g3:vgl-vnc-novnc-firefox-plus
https://get.webgl.org/ says the browser supports WebGL, and Blender can run over noVNC with or without vglrun -d :1 blender. However, I didn't feel any speedup with vglrun in the Blender GUI.
apt install glmark2 and vglrun -d :1 glmark2 crashed with:
=======================================================
glmark2 2014.03+git20150611.fa71af2d
=======================================================
OpenGL Information
GL_VENDOR: Mesa/X.org
GL_RENDERER: llvmpipe (LLVM 11.0.0, 256 bits)
GL_VERSION: 3.1 Mesa 20.2.6
=======================================================
[build] use-vbo=false:
X Error of failed request: BadMatch (invalid parameter attributes)
Major opcode of failed request: 131 (MIT-SHM)
Minor opcode of failed request: 3 (X_ShmPutImage)
Serial number of failed request: 44
Current serial number in output stream: 45
glmark2 without vglrun works, but it seems to use software rendering instead of the GPU.
glxinfo | grep vendor says:
server glx vendor string: SGI
client glx vendor string: Mesa Project and SGI
OpenGL vendor string: Mesa/X.org
vglrun -d :1 glxinfo | grep vendor says:
server glx vendor string: VirtualGL
client glx vendor string: VirtualGL
OpenGL vendor string: Mesa/X.org
but glxinfo crashed.
Then I tried sudo docker run --name blender -p 6901:6901 accetto/ubuntu-vnc-xfce-blender-g3:vgl-vnc-novnc-firefox-plus without the GPU; Blender works fine, and the browser still says that I have WebGL.
apt install glmark2 and glmark2 work:
=======================================================
glmark2 2014.03+git20150611.fa71af2d
=======================================================
OpenGL Information
GL_VENDOR: Mesa/X.org
GL_RENDERER: llvmpipe (LLVM 11.0.0, 256 bits)
GL_VERSION: 3.1 Mesa 20.2.6
=======================================================
It seems that there is a software OpenGL (SGL) somewhere trying to simulate the GPU.
I will try installing nvidia-driver-450-server in the container (same as on the host machine) and see if it helps.
@talent-less wrote on May 5, 2021:
It's a bit strange that even before I install any driver in the container, nvidia-smi is in /usr/bin and it can detect the graphics card inside the container without any problem. I assume the docker plugin mounts it for me. I also tried some Nvidia CUDA programs and they run without problems.
It's just OpenGL that still falls back to software rendering.
Do I need to install X on the host and share it into the container with -v /tmp/.X11-unix/X0:/tmp/.X11-unix/X0:rw like in https://medium.com/@benjamin.botto/opengl-and-cuda-applications-in-docker-af0eece000f1 ?
@talent-less wrote on May 5, 2021:
I also tried sudo docker run --name blender --gpus all --device=/dev/dri/card0:/dev/dri/card0 -p 6901:6901 -v /tmp/.X11-unix/:/tmp/.X11-unix/:rw -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=graphics,utility,compute,video -d accetto/ubuntu-vnc-xfce-blender-g3:vgl-vnc-novnc-firefox-plus
but the container exited immediately with code 0 without producing any logs.
@accetto wrote on May 5, 2021:
Thanks for the feedback.
I'm still not quite sure about the configuration you use. Do you run the container on your local host or in a cloud? Do you try to use an NVidia card included in your host computer, or do you try to use a GPU in a cloud? I've got the impression that you've got a host with an Nvidia card and you're trying to share local devices with the container, is that right? And one more important thing: are you on Linux or on Windows?
I'm unable to reproduce your scenarios, but this is what I've found while experimenting on my Linux computer. Maybe it'll help.
The scenario is about running on a local Linux computer and sharing X11 between the host computer and the container. A few remarks first:
- The local X server runs on display :0 (not on :1).
- You can skip starting the VNC server with --skip-vnc. (However, if the VNC server is not running, then noVNC will also not be available.)
- Blender also requires the sound device, if started with vglrun.
This is what has worked for me during testing. The best approach is to use two separate terminal windows.
Start a new container in the first terminal window:
# allow access to the local X server
xhost +local:$(whoami)
# start the container
docker run -it --rm \
-v /tmp/.X11-unix:/tmp/.X11-unix:rw \
-e DISPLAY=$DISPLAY \
--device /dev/snd:/dev/snd:rw \
--group-add audio \
--device /dev/dri/card0 \
--name devrun \
--hostname devrun \
accetto/ubuntu-vnc-xfce-blender-g3:vgl-vnc --skip-vnc
Then in the second terminal window connect to the running container and do what you need. For example:
# connect to the container
docker exec -it devrun bash
# install glmark2
sudo apt-get update
sudo apt-get install -y glmark2
# start glmark2 without vglrun
glmark2
# or with vglrun (should not crash this time)
vglrun glmark2
# or blender
blender
# however, I've got crashes with blender with vglrun (maybe some more options are required)
vglrun blender
However, it can also be, that you simply need to install nvidia-docker on your host computer. Or maybe we're still missing something in the image.
Tomorrow I'll check the link you've sent above.
Btw, sometimes it helps to start the containers with the --debug option; you could get some hints about the problem.
@talent-less wrote on May 6, 2021:
I was testing on a clean Ubuntu 20.10 installation with the latest docker and nvidia-docker plugin on a physical machine at home without any virtualization.
I tried sharing the local X into the container with noVNC enabled; I guess that's why it crashed. I don't think sharing the local X with noVNC disabled is what I would prefer, as I would like to share the container with a remote user.
@talent-less wrote on May 6, 2021:
@accetto were you able to get OpenGL working with noVNC? I guess glmark2 should be showing the correct vendor (Intel, for example) instead of Mesa
@accetto wrote on May 6, 2021:
I guess glmark2 should be showing the correct vendor (Intel, for example) instead of Mesa
I'm not quite sure about this. I'm afraid that nvidia-docker would recognize and work only with Nvidia hardware (see the Note to the reader in this article).
I was testing on a clean Ubuntu 20.10 installation with the latest docker and nvidia-docker plugin on a physical machine at home without any virtualization.
We would need at least one system with an Nvidia GPU to be able to test it all. Do you have an Nvidia GPU in your host? Alternatively, do you have access to a cloud GPU (as described in this article)?
I don't think sharing local X with novnc disabled is what I would prefer, as I would like to share the container to a remote user.
I think that it would help if you could draw a diagram describing the configuration you would like to use. Then we could concentrate on that particular scenario. If you do not have a diagramming software and don't want to use an online one, then you can use the container accetto/ubuntu-vnc-xfce-drawio-g3.
@talent-less wrote on May 7, 2021:
We would need at least one system with Nvidia GPU to be able to test it all.
Sure. The Ubuntu machine has a 1050Ti Nvidia GPU.
I also tried some Nvidia CUDA programs and they can run without problem.
And I think I have got the nvidia-docker plugin working, because I can run GPU programs (CUDA programs) in your docker image. These programs don't use OpenGL-related libraries though.
I also have access to a cloud with 1080Ti GPUs, but I have to pay for using it. So I would like to test the image first on the local ubuntu machine.
However, I would test as if the local machine were in the cloud, i.e. no physical display connected and the display output sent via noVNC to my laptop.
@talent-less wrote on May 7, 2021:
I'm afraid that the nvidia-docker would recognize and work only with Nvidia hardware (see the Note to the reader in this article).
Sure, I know.
@accetto were you able to get opengl working with novnc? I guess glmark should be showing the correct vendor (Intel, for example) instead of Mesa
I asked this because I suspect that noVNC is overriding the graphics hardware (nvidia in my case and intel in yours) with software OpenGL from Mesa.
@accetto wrote on May 7, 2021:
I also have access to a cloud with 1080Ti GPUs, but I have to pay for using it. So I would like to test the image first on the local ubuntu machine.
Exactly my view. :)
And I think I have got the nvidia-docker plugin working, because I can run GPU programs (CUDA programs) in your docker image. These programs don't use OpenGL-related libraries though.
That's really good news! What was needed to get it running? As I've already mentioned, OpenGL/Blender is not exactly my playground, unfortunately. And not having any Nvidia hardware at the moment makes me kind of blind in one eye. :) However, I'll be glad to co-develop a useful image.
Do you have already some ideas what should/could we change in the image? I'll try to get some hints from the articles we've found, but it'll take some time.
Should I include some test apps in the images, like glmark2, so you would not need to install them each time anew?
@talent-less wrote on May 7, 2021:
glmark2 should be a handy tool :)
I went through https://medium.com/@benjamin.botto/opengl-and-cuda-applications-in-docker-af0eece000f1 again, and it seems that some of libglvnd0, libgl1, libglx0 and libegl1 are missing in the container.
According to https://gitlab.com/nvidia/container-images/opengl/-/blob/ubuntu20.04/glvnd/runtime/Dockerfile, both the 64-bit and i386 versions can be installed, and libglvnd0 needs a json file:
RUN apt-get update && apt-get install -y --no-install-recommends \
libglvnd0 libglvnd0:i386 \
libgl1 libgl1:i386 \
libglx0 libglx0:i386 \
libegl1 libegl1:i386 \
libgles2 libgles2:i386 && \
rm -rf /var/lib/apt/lists/*
COPY 10_nvidia.json /usr/share/glvnd/egl_vendor.d/10_nvidia.json
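For reference, the 10_nvidia.json copied in that last line is a small glvnd EGL vendor manifest that points glvnd at the NVIDIA EGL driver (the thread later confirms it references libEGL_nvidia.so.0). In the nvidia container images its content looks like this:

```json
{
    "file_format_version" : "1.0.0",
    "ICD" : {
        "library_path" : "libEGL_nvidia.so.0"
    }
}
```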
Here is how glvnd works: https://www.x.org/wiki/Events/XDC2016/Program/xdc-2016-glvnd-status.pdf
I am not sure if glvnd should be installed together with Mesa. Mesa seems to ship a software OpenGL and blender/firefox seems to use it instead of the real HW at the moment.
@talent-less wrote on May 7, 2021:
I am not sure if glvnd should be installed together with Mesa. Mesa seems to ship a software OpenGL and blender/firefox seems to use it instead of the real HW at the moment.
Oh, that is outdated.
In the process of migrating to Mesa 18.0, Canonical has updated their out-of-tree Mir patches for Mesa. They are also now enabling the OpenGL Vendor Neutral Dispatch Library "GLVND" that allows multiple OpenGL drivers to happily co-exist on the same system.
https://phoronix.com/scan.php?page=news_item&px=Ubuntu-18.04-Getting-Mesa-18.0
So maybe installing libglvnd0 will make it work automagically.
@accetto wrote on May 7, 2021:
Thanks, I'll check it. I'll also include glmark2. Let me know if some other test apps or utilities would also be useful. We can remove them later if the images become too big.
BTW, should I continue building all the tags I've published? For example, I'm not sure if you use the images with chromium and the ones without noVNC for testing.
@talent-less wrote on May 7, 2021:
I will only use the ones with Firefox and novnc. Maybe you can disable other ones for now and reenable them when the opengl issue is resolved.
@accetto wrote on May 7, 2021:
I've already included all the libraries you've listed above. You can check it by executing the following:
sudo apt-get update && \
apt search libglvnd0 ; \
apt search libgl1 ; \
apt search libglx0 ; \
apt search libegl1 ; \
apt search libgles2
Only the json file is not there because I didn't know about it. You can easily add it yourself and test it again. I'll also try to find out if something else could be missing.
Remark: Not all libraries are explicitly visible in the Dockerfile, because many of them are installed as dependencies. In this case they are mostly part of the mesa stuff.
@accetto wrote on May 8, 2021:
I've just published an updated image tag for you and removed the other ones. I can build them any time if you'll need them for testing.
You can download the image by:
docker pull accetto/ubuntu-vnc-xfce-blender-g3:vgl-vnc-novnc-firefox-plus
I've added the glmark2 test app and also the json file you've pointed to. However, I've noticed that there already was the file 50_mesa.json with exactly the same content. Unfortunately, adding the file didn't solve the problem of glmark2 crashing when started with vglrun.
The json file points to the module libEGL_nvidia.so.0, which is not installed in the image. I have tried to add it by installing the package libnvidia-gl-440, but it didn't help with the crashing problem. I assume that nvidia-docker and Nvidia hardware are still required.
One more tip for experimenting.
If you want to force the container's VNC to use display #0, you can do it by overriding the VNC parameter at runtime, as described here. You can also override other VNC parameters and simplify your docker run commands. Just be careful to bind a single file only, not a folder. The file content could look something like this, including the empty VNC password to avoid typing it each time :)
export DISPLAY=:0
export VNC_PW=
# export DISPLAY=:2
# export VNC_COL_DEPTH=32
# export VNC_VIEW_ONLY=true
# export VNC_RESOLUTION=1024x768
# export VNC_PORT=5902
# export NO_VNC_PORT=6902
Then you can use, for example, just vglrun glxgears in the container.
However, using display #0 still doesn't help with the crashing vglrun glmark2, at least not for me. Using the debug option like vglrun glmark2 -d provides some output, but I still don't see the reason.
@accetto wrote on May 8, 2021:
I've played a little bit with the image accetto/ubuntu-vnc-xfce-blender-g3:vgl-vnc-novnc-firefox-plus and I may have something for you.
Try this:
xhost +local:$(whoami)
docker run -it --rm -P \
-v /tmp/.X11-unix:/tmp/.X11-unix:rw \
--device /dev/dri/card0 \
--device /dev/snd:/dev/snd:rw \
--group-add audio \
accetto/ubuntu-vnc-xfce-blender-g3:vgl-vnc-novnc-firefox-plus
xhost -local:$(whoami)
It seems to be promising, because glmark2 with vglrun already correctly detects my video hardware and we do not lose the VNC/noVNC access you wanted.
This is how it looks if you start glmark2 inside the container without vglrun, using just the glmark2 command:
If you use vglrun glmark2, then it looks better already:
Notice that there is no -d option used with vglrun. It means that glmark2 still runs on display :0, leaving display :1 for the VNC/noVNC access, and vglrun passes the GLX commands through.
You can check it by not allowing access to your local display :0, i.e. leaving out the command xhost +local:$(whoami). Then you will get the following error:
Sharing the audio device is required for Blender. You can test it the same way: vglrun blender.
Please let me know if it worked.
@accetto wrote on May 10, 2021:
Hello @talent-less , does it already work?
@talent-less wrote on May 13, 2021:
@accetto Sorry for the late reply.
It doesn't seem to work on my side. The container exits immediately after running your command.
I think the local Ubuntu gnome session is doing something strange with the X domain socket file, so the container crashed right after docker run. The container didn't put anything into stdout or stderr. Is there any way to get some logs?
However, I can start the container via SSH from my laptop when the Ubuntu gnome session is logged out locally. But there I couldn't run xhost +local:$(whoami), as xhost said "couldn't find display". I can ignore that and start the container, but it ended up like in your last screenshot.
Do I need to somehow start the X server first?
@accetto wrote on May 13, 2021:
No, your X server is already running, if you are on Linux and already running a GUI desktop session.
Check it like this (on your host, not inside the container):
sudo ls /tmp/.X11*
### you should get something like this
X0
### check also the DISPLAY variable
echo $DISPLAY
### you should get something like this
:0.0
If xhost +local:$(whoami) doesn't work, then try xhost +, which grants access to everybody. Be sure that you run it on your host, not in the container. If it still doesn't work, then it could be a permission problem. Try to use sudo then.
@accetto wrote on May 13, 2021:
Maybe some more tips about what you can do during troubleshooting.
You should probably check the readme file for startup options help. You can also display it by starting a container with the options -h and --help-usage. You will probably find the options --debug, --verbose, --tail-vnc, --skip-vnc and --skip-startup most helpful.
You can bind the dockerstartup directory as an external volume and you'll get the logs directly there. You can make modifications in the startup scripts, e.g. by adding some debug reporting, then start the container with the --skip-startup option and execute the startup script from the second terminal. Something like this:
### start a container in the first terminal, skipping the startup script
docker run -it -P --name mytest <other-options> <image> --skip-startup bash
### connect from the second terminal ...
docker exec -it mytest bash
### ... and execute the startup script manually, possibly passing some startup options
headless@mytest:~$ /dockerstartup/./startup.sh --verbose
@talent-less wrote on May 13, 2021:
@accetto Thank you so much for the work and hints! I am able to get vglrun glmark2 and vglrun blender working on an NVIDIA GPU!
=======================================================
glmark2 2014.03+git20150611.fa71af2d
=======================================================
OpenGL Information
GL_VENDOR: NVIDIA Corporation
GL_RENDERER: GeForce GTX 1050 Ti/PCIe/SSE2
GL_VERSION: 4.6.0 NVIDIA 450.119.03
=======================================================
Here is the configuration for NVIDIA GPUs.
sudo docker run -it --rm --gpus all --name blender -p 6901:6901 -P -v /tmp/.X11-unix/X1:/tmp/.X11-unix/X0:rw --device /dev/dri/card0 --device /dev/snd:/dev/snd:rw -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=graphics,utility,compute,video --group-add audio accetto/ubuntu-vnc-xfce-blender-g3:vgl-vnc-novnc-firefox-plus
I didn't configure libglvnd so I guess it takes the default graphics card from the X server on the host.
My default DISPLAY was on X1 and it crashed the container previously. -v /tmp/.X11-unix/X1:/tmp/.X11-unix/X0:rw fixes that.
So the last missing piece would be figuring out how to run the container in the cloud. @accetto Do you know how to run xserver and xhost from ssh on a headless Ubuntu Server instead of Ubuntu Desktop?
@accetto wrote on May 13, 2021:
I am able to get vglrun glmark2 and vglrun blender working on a NVIDIA GPU
That's really great news! :+1:
Do you know how to run xserver and xhost from ssh on a headless Ubuntu Server instead of Ubuntu Desktop?
Not really, because I haven't needed it yet. Basically it should be similar to the desktop case, just without starting the Xfce4 stuff. Actually, the container is a headless server, so it should be possible to use it for testing. Check for example this article. You can always start a container skipping the startup script as described above and modify anything inside the container. The included nano editor does not require a GUI. However, ssh is not included, so you need to install it yourself.
You can save intermediate stages as helper images using docker commit.
You can also try PuTTY, which provides X11 forwarding.
@talent-less wrote on May 14, 2021:
I am a little bit confused now. Do I really need the X server on the host?
VirtualGL redirects an application's OpenGL/GLX commands to a separate X server (that has access to a 3D graphics card), captures the rendered images, and then streams them to the X server that actually handles the application.
On a headless server, by default I have no X server talking to the GPU on the host. Can I somehow configure the X server in the container to use the GPU?
@accetto wrote on May 15, 2021:
Do I really need the X server on the host?
I would say so, but I'm actually also not 100% sure at the moment. I would recommend starting to draw some diagrams, to clarify what is running where and what is talking to what and how.
I think, that the use case we've just tested goes like this:
The Blender application is running inside the container and it draws on display 1. The container has no GPU and it's actually a kind of remote host. The Nvidia GPU is available only on the local host and it can work only with display 0, as I've understood it. So we need VirtualGL inside the container (= remote host) to pass the graphical commands and the rendered content between displays 1 and 0. In other words, Blender thinks that it's drawing on display 1 and the GPU thinks that it's drawing on display 0. VirtualGL makes them both believe that they are right. :)
From that description I would derive that both hosts (remote and local) need some kind of X server, because they both want to draw on a graphical display. There is no need for a GPU on non-graphical terminals.
Well, maybe I've misunderstood something, but that's actually my point. I would draw some diagrams for the use case you need.
BTW, I would like to move this conversation to the project accetto/headless-drawing-g3, because it actually belongs there and this Generation 1 repository is slowly approaching its retirement. Another alternative would be the GitHub Discussions by the sibling repository accetto/ubuntu-vnc-xfce-g3. What do you think?
@talent-less wrote:
Hi, I have been using VirtualGL for a while, but the performance was still not ideal compared to running programs directly on the host; glmark2 suggests a 15% loss on my machine. Also, the configuration requires an X server to be installed and configured (xhost) on the host, which might be a headache for some cloud platforms (especially container-only ones) that do not provide you access to the host system at all.
Luckily, I found https://github.com/ehfd/docker-nvidia-glx-desktop which uses glvnd instead of virtualGL and therefore doesn't require x server on the host.
@accetto Are you interested in continuing improving of the container?
@accetto wrote:
Hi, I have been using VirtualGL for a while, but the performance was still not ideal compared to running programs directly on the host; glmark2 suggests a 15% loss on my machine.
This is no surprise, because it's generally true for any kind of virtualization. However, I'm not sure if you really need to use VirtualGL at all. I saw it always as an option. The example I've provided before was actually only a quick experiment, when I was testing if I can get my Intel GPU reported inside the container. Because I do not have NVidia hardware, I'm not able to test the exact use case you're interested in, unfortunately. I also don't have any cloud subscription with GPU resources, so I'm not able to test such use cases either. I've hoped that you'll find out the best way to use it. :)
Also, the configuration requires an X server to be installed and configured (xhost) on the host, which might be a headache for some cloud platforms
Actually, I'm not sure if you need to share the X Server socket and use xhost at all. I've provided such an example only because you've said that you want to test the GPU provided by your host. I've supposed that you'll not do the same on the cloud. This is why you wanted noVNC after all, isn't it?
At the very beginning you provided this link. In that article the author describes two main scenarios: sharing the X11 socket and installing the X server stuff inside the container. The author prefers the first scenario, but my containers actually implement the second one, which is also more suitable for the cloud, I would assume.
Have you already tried to use my container the same way as the one you've mentioned above?
I haven't tried or analyzed it yet, but at first look there should be no difference. The author himself describes it as
container supporting GLX/Vulkan for NVIDIA GPUs by spawning its own X Server and noVNC WebSocket interface instead of using the host X server.
and this is actually exactly what my containers do. The glvnd is also available as part of the package libglvnd0, which is already included. Only the noVNC port is 6901 and the NVidia drivers are not installed yet. But if only those are missing, then it would be no problem. Can you try to use my container exactly the same way? However, install the NVidia drivers first.
Are you interested in continuing improving of the container?
Sure I am. I'll be glad to make the containers more useful also with applications that could benefit from using a GPU, even if I don't have such use cases at the moment. My main use cases are more about encapsulated working environments for experimenting, testing and development. Because I don't have access to any NVidia GPU locally or in the cloud, I would welcome any help with testing such use cases that I cannot do myself. Also, I see Blender as a generally useful application, but I don't have any experience with it.
@talent-less wrote:
Thanks for your reply.
I tried apt install kmod curl and then installing the nvidia driver using https://github.com/ehfd/docker-nvidia-glx-desktop/blob/cec9907cf2ad826aac53946e40bb9226fc4ea5b1/bootstrap.sh#L11 from bash, but it failed with:
ERROR: You appear to be running an X server; please exit X before installing.
For further details, please see the section INSTALLING THE NVIDIA DRIVER
in the README available on the Linux driver download page at
www.nvidia.com.
Then I tried it with docker run -it -P --name mytest <other-options> <image> --skip-startup bash. The driver installed successfully, but I still cannot get the GPU working. I also tried running a script before /dockerstartup/startup.sh --verbose, but it didn't help.
#!/bin/bash
set -e

if [ "$NVIDIA_VISIBLE_DEVICES" == "all" ]; then
  export GPU_SELECT=$(sudo nvidia-smi --query-gpu=uuid --format=csv | sed -n 2p)
elif [ -z "$NVIDIA_VISIBLE_DEVICES" ]; then
  export GPU_SELECT=$(sudo nvidia-smi --query-gpu=uuid --format=csv | sed -n 2p)
else
  export GPU_SELECT=$(sudo nvidia-smi --id=$(echo "$NVIDIA_VISIBLE_DEVICES" | cut -d ',' -f1) --query-gpu=uuid --format=csv | sed -n 2p)
  if [ -z "$GPU_SELECT" ]; then
    export GPU_SELECT=$(sudo nvidia-smi --query-gpu=uuid --format=csv | sed -n 2p)
  fi
fi

if [ -z "$GPU_SELECT" ]; then
  echo "No NVIDIA GPUs detected. Exiting."
  exit 1
fi

if ! sudo nvidia-smi --id="$GPU_SELECT" -q | grep -q "Tesla"; then
  DISPLAYSTRING="--use-display-device=None"
fi

HEX_ID=$(sudo nvidia-smi --query-gpu=pci.bus_id --id="$GPU_SELECT" --format=csv | sed -n 2p)
IFS=":." ARR_ID=($HEX_ID)
unset IFS
BUS_ID=PCI:$((16#${ARR_ID[1]})):$((16#${ARR_ID[2]})):$((16#${ARR_ID[3]}))

sudo nvidia-xconfig --virtual="1024x768" --depth="24" --mode="1024x768" --allow-empty-initial-configuration --no-use-edid-dpi --busid="$BUS_ID" --only-one-x-screen --no-xinerama "$DISPLAYSTRING"

if [ "x$SHARED" == "xTRUE" ]; then
  export SHARESTRING="-shared"
fi

shopt -s extglob
for TTY in $(ls -1 /dev/tty+([0-9]) | sort -rV); do
  if [ -w "$TTY" ]; then
    Xorg vt"$(echo "$TTY" | grep -Eo '[0-9]+$')" :0 &
    break
  fi
done
sleep 1
export DISPLAY=:0

UUID_CUT=$(sudo nvidia-smi --query-gpu=uuid --id="$GPU_SELECT" --format=csv | sed -n 2p | cut -c 5-)
if vulkaninfo | grep "$UUID_CUT" | grep -q ^; then
  VK=0
  while true; do
    if ENABLE_DEVICE_CHOOSER_LAYER=1 VULKAN_DEVICE_INDEX=$VK vulkaninfo | grep "$UUID_CUT" | grep -q ^; then
      export ENABLE_DEVICE_CHOOSER_LAYER=1
      export VULKAN_DEVICE_INDEX="$VK"
      break
    fi
    VK=$((VK + 1))
  done
else
  echo "Vulkan is not available for the current GPU."
fi
I assume that the answer from above wasn't really meant for me, was it? :)
Have you already tried to use my container the same way as the one you've mentioned above? I haven't tried or analyzed it yet, but at first look there should be no difference.
I was trying to manually perform what docker-nvidia-glx-desktop does in your container, but I failed to get it to work. I noticed that although docker-nvidia-glx-desktop installs the NVIDIA driver and glvnd explicitly, it also uses FROM nvidia/opengl:1.2-glvnd-devel-ubuntu20.04 instead of the official Ubuntu image. I wonder if there is some magic sauce in the nvidia image that makes x11vnc happy.
I see. :)
I haven't had time yet to test or analyze the other image, but I actually meant to use the same or a similar kind of docker run command line, without X11 socket sharing etc. I'm not sure if it would work right away, of course.
It could be that there is something important inside the base image nvidia/opengl:1.2-glvnd-devel-ubuntu20.04. I'll try to look at that as well. Maybe the best trick would be to use that image as the base. However, I would not be able to really test it at the moment.
As a tip, you can install stuff like the NVidia drivers at run time, then export the running container as a new image, and then use that image for the next experiment, and so on. Something like this:
1. Start a new container: docker run --name mycont ...
2. Install the NVidia drivers inside the container:
### update the apt cache first
sudo apt-get update
### install the NVidia drivers
sudo apt install nvidia-driver-460
3. Export the running container as a new image (from outside the container):
docker commit mycont my/image:nvidia
Then you can create new containers from the image my/image:nvidia, install more stuff at run time, export important milestones as new images, and so on.
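The whole loop could be scripted roughly like this. It's only a sketch: the container and image names are placeholders, and the docker commands are shown as comments because they need a running Docker daemon.

```shell
# Sketch of the commit-based iteration loop; all names are placeholders.
BASE_IMAGE="accetto/ubuntu-vnc-xfce-g3"
CONTAINER="mycont"
SNAPSHOT="my/image:nvidia"

# 1) start a container from the current image:
#      docker run -it -P --name "$CONTAINER" "$BASE_IMAGE"
# 2) inside the container, install the drivers:
#      sudo apt-get update && sudo apt install nvidia-driver-460
# 3) from the host, snapshot the container as a new image:
#      docker commit "$CONTAINER" "$SNAPSHOT"

echo "next iteration starts from: $SNAPSHOT"
```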
It really doesn't help that I don't own any NVidia hardware. I'm trying to find some quick and affordable solution, but with no success yet.
I did some testing and I want to share the results.
I've used the glmark2 benchmark application and an image containing mesa-utils and VirtualGL.
First I've run glmark2
directly on my host and I've got the following result:
=======================================================
glmark2 2014.03+git20150611.fa71af2d
=======================================================
OpenGL Information
GL_VENDOR: Intel
GL_RENDERER: Mesa Intel(R) UHD Graphics 620 (KBL GT2)
GL_VERSION: 4.6 (Compatibility Profile) Mesa 20.2.6
=======================================================
...
=======================================================
glmark2 Score: 3936
=======================================================
Then I've run it inside the container, but using the host's display #0. The VNC server must not be started in this case.
xhost +local:$(whoami)
docker run -it -P --rm \
-e DISPLAY=${DISPLAY} \
--device /dev/dri/card0 \
-v /tmp/.X11-unix:/tmp/.X11-unix:rw \
--name devrun \
<image-with-mesa-and-virtualgl> --skip-vnc
xhost -local:$(whoami)
From the second terminal:
docker exec -it devrun glmark2
The result has been:
=======================================================
glmark2 2014.03+git20150611.fa71af2d
=======================================================
OpenGL Information
GL_VENDOR: Intel
GL_RENDERER: Mesa Intel(R) UHD Graphics 620 (KBL GT2)
GL_VERSION: 4.6 (Compatibility Profile) Mesa 20.2.6
=======================================================
...
=======================================================
glmark2 Score: 4074
=======================================================
It's interesting that the result was even slightly better than in the first case, but it could be just a usual deviation.
Then I've tested it inside the container, first without using VirtualGL. Software rendering was used in this case. I've accessed the container with TigerVNC Viewer. The VNC server was running, of course.
xhost +local:$(whoami)
docker run -it -P --rm \
--device /dev/dri/card0 \
-v /tmp/.X11-unix:/tmp/.X11-unix:rw \
--name devrun \
<image-with-mesa-and-virtualgl>
xhost -local:$(whoami)
The result has been significantly worse:
headless@devrun:~$ glmark2
** GLX does not support GLX_EXT_swap_control or GLX_MESA_swap_control!
** Failed to set swap interval. Results may be bounded above by refresh rate.
=======================================================
glmark2 2014.03+git20150611.fa71af2d
=======================================================
OpenGL Information
GL_VENDOR: Mesa/X.org
GL_RENDERER: llvmpipe (LLVM 11.0.0, 256 bits)
GL_VERSION: 3.1 Mesa 20.2.6
=======================================================
...
=======================================================
glmark2 Score: 378
=======================================================
It's just about 10% of the first result directly on the host! I've got a score of just 306 when accessing the container via noVNC. That was another test run, of course. The slightly lower value is expected, because there is additional processing by websockify.
Finally I've run the test inside the container using vglrun glmark2. The container has been started the same way as the one above. I've got the following result:
headless@devrun:~$ vglrun glmark2
=======================================================
glmark2 2014.03+git20150611.fa71af2d
=======================================================
OpenGL Information
GL_VENDOR: Intel
GL_RENDERER: Mesa Intel(R) UHD Graphics 620 (KBL GT2)
GL_VERSION: 4.6 (Compatibility Profile) Mesa 20.2.6
=======================================================
...
=======================================================
glmark2 Score: 549
=======================================================
It can be seen that the Intel UHD Graphics 620 has been used for rendering and that the result is about 45% better than the previous one. So it seems that VirtualGL is really working. Accessing the container via noVNC I've got a score of 447.
Note that I've got only about 60% of the scores while using an external 4K monitor.
I would make the following conclusions from these tests:
- Using NVidia GPUs will additionally require a host with the NVIDIA Docker Toolkit installed.
- It's worth using the vglrun command from the VirtualGL toolkit, because the performance gain seems to be about 45%. It can probably be much more with NVidia hardware and drivers.
@talent-less FYI
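On a machine with an NVidia GPU, the same benchmark could in principle be repeated with the GPU exposed to the container. This is untested on my side: the image name is a placeholder, and '--gpus all' requires the NVIDIA Container Toolkit on the host.

```shell
# Sketch, untested without NVidia hardware: expose the GPU to the
# container and run glmark2 through VirtualGL. Image name is a placeholder.
IMAGE="accetto/ubuntu-vnc-xfce-g3"
RUN_CMD="docker run -it -P --rm --gpus all --name devrun $IMAGE"
EXEC_CMD="docker exec -it devrun vglrun glmark2"
printf '%s\n%s\n' "$RUN_CMD" "$EXEC_CMD"
```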
I've included the Mesa3D and VirtualGL support in the master branch and therefore removed the experimental branch exp-vgl. I've also re-published the images with Blender, and they all now include the Mesa3D libraries and the VirtualGL toolkit. The glvnd support is also included, of course. Please use those new images for experimenting.
I've also started a new discussion Supporting OpenGL/WebGL and using HW acceleration.
This issue was originally filed by @talent-less as issue #5 in the Generation 1 repository accetto/ubuntu-vnc-xfce-firefox. I've moved the complete conversation here, because the Generation 1 repo is approaching its retirement and will probably be removed in the future. Also, the subject fits this repository better.