Container doesn't start with volume .X11-unix:rw

uflik commented 2 years ago

Hello,

I wanted to use the vnc-novnc-mesa-vgl image. It is exactly what I need.

I need hardware acceleration, so I used the related command from the Docker Hub page (to start the container and share the X11 socket of the host):

xhost +local:$(whoami)

docker run -it -P --rm \
    --device /dev/dri/card0 \
    -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
    accetto/ubuntu-vnc-xfce-opengl-g3:vnc-novnc-mesa-vgl

xhost -local:$(whoami)

However, the container does not start if I pass the "-v /tmp/.X11-unix:/tmp/.X11-unix:rw" parameter.

- The docker logs command returns nothing.
- I have an Nvidia card with the appropriate driver installed on the host:

nvidia-smi

Fri Feb 18 19:10:34 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P0    22W /  N/A |    528MiB /  6144MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2552      G   /usr/lib/xorg/Xorg                359MiB |
|    0   N/A  N/A      2785      G   /usr/bin/gnome-shell               67MiB |
|    0   N/A  N/A      4506      G   ...AAAAAAAAA= --shared-files       39MiB |
|    0   N/A  N/A      4769      G   ...AAAAAAAAA= --shared-files       34MiB |
|    0   N/A  N/A     17605      G   ...AAAAAAAAA= --shared-files       21MiB |
|    0   N/A  N/A     41321      G   /usr/lib/firefox/firefox            1MiB |
+-----------------------------------------------------------------------------+

- I'm using Ubuntu 20 LTS with GNOME (gdm 3).
- Docker version: 20.10.12, build e91ed57
- The path /tmp/.X11-unix exists on the host; it contains 1 file: srwxrwxrwx 1 krisztian krisztian 0 Feb 18 11:25 X1=
- I tried executing the docker command with sudo (same result).
- echo $DISPLAY returns :1

- If I don't pass that parameter, the container starts, but it says:

 vglrun glmark2 
[VGL] ERROR: Could not open display :0.

Is there anything I can try or need to check?

accetto commented 2 years ago

Hello,

thanks for the feedback. However, I'm not able to reproduce the issue.

I've used exactly the commands you've used and I have no problem. I can connect to the container via VNC viewer and also the command vglrun glmark2 works correctly.

Unfortunately, I do not have an Nvidia card in my computer, so it's not the same environment, but I don't think that the problem is there.

Where exactly have you executed your command echo $DISPLAY? On the host or inside the container?

If it was on the host, then it would mean that you're already using the display :1.

Please try the command provided below. Note that I've removed the parameter -P and I've also added some more parameters to make debugging more convenient:

-e DISPLAY=:2 (or some other display number) could solve the problem
-p "35000:5901" always binds VNC to the same port 35000
-p "35001:6901" always binds noVNC to the same port 35001
-e VNC_PW="" sets an empty password, so you don't need to type it
--tail-vnc lets you watch the VNC log at runtime

docker run -it --rm \
    --device /dev/dri/card0 \
    -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
    -e DISPLAY=:2 \
    -p "35000:5901" \
    -p "35001:6901" \
    -e VNC_PW="" \
    accetto/ubuntu-vnc-xfce-opengl-g3:vnc-novnc-mesa-vgl --tail-vnc
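The number in -e DISPLAY=:2 only has to differ from the displays already in use on the host. As a rough sketch, a free number could be picked by looking at the shared socket directory. This assumes the X sockets are named X<number> in /tmp/.X11-unix, as in the listing above; the helper name first_free_display is hypothetical and not part of the image:

```shell
# Print the lowest display number that has no X<number> entry in the
# given socket directory (default: /tmp/.X11-unix).
# Hypothetical helper for illustration only.
first_free_display() {
    dir="${1:-/tmp/.X11-unix}"
    n=0
    # Step through X0, X1, ... until a name is not taken.
    while [ -e "$dir/X$n" ]; do
        n=$((n + 1))
    done
    echo "$n"
}
```

Run on the host, the result could then replace the fixed number, for example -e DISPLAY=":$(first_free_display)". It only checks for socket names, so it is a heuristic rather than a guarantee that the display is free.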

Please let me know if it helped.

One more remark: I'm currently working on a new release, so expect some minor changes in image tagging. In particular, the image tag vnc-novnc-mesa-vgl will probably become latest-mesa-vgl soon. It will be similar to the project accetto/ubuntu-vnc-xfce-g3, which I released yesterday.

Regards, accetto

uflik commented 2 years ago

Thank you very much for the quick answer. The container now starts with the volume. You were right: DISPLAY :1 was already in use on the host. (I'm not very experienced with X/GPU topics, so I didn't think that could be a problem.)

However, I still get the same error when I try to run the "vglrun glmark2" command: [VGL] ERROR: Could not open display :0.

The container logs:

Tailing VNC log '/dockerstartup/vnc.log'
Using desktop session xfce
xauth:  file /home/headless/.Xauthority does not exist

New '12f020ac6393:2 ()' desktop is 12f020ac6393:2

Starting desktop session xfce
xinit /etc/X11/Xsession startxfce4 -- /usr/bin/Xvnc :2 -depth 24 -geometry 1360x768 -rfbport 5901 -auth /home/headless/.Xauthority -desktop 12f020ac6393:2 () -pn -rfbauth /home/headless/.vnc/passwd -rfbwait 30000

Xvnc TigerVNC 1.11.0 - built Sep  8 2020 12:27:03
Copyright (C) 1999-2020 TigerVNC Team and many others (see README.rst)
See https://www.tigervnc.org for information on TigerVNC.
Underlying X server release 12001000, The X.Org Foundation

Sun Feb 20 14:40:04 2022
 vncext:      VNC extension running!
 vncext:      Listening for VNC connections on all interface(s), port 5901
 vncext:      created VNC server for screen 0
xinit: XFree86_VT property unexpectedly has 0 items instead of 1

Sun Feb 20 14:45:35 2022
 Connections: accepted: 127.0.0.1::37902
 SConnection: Client needs protocol version 3.8
 SConnection: Client requests security type VncAuth(2)

Sun Feb 20 14:45:39 2022
 VNCSConnST:  Server default pixel format depth 24 (32bpp) little-endian rgb888
 VNCSConnST:  Client pixel format depth 24 (32bpp) little-endian rgb888

So I thought I should run the command like this: vglrun -d :1 glmark2 (maybe I'm wrong, but if I understand correctly, this tells VGL to use my host's DISPLAY :1 for rendering). But it also failed:

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
Error: glXCreateNewContext failed
Error: CanvasGeneric: Invalid EGL state
Error: main: Could not initialize canvas
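Because vglrun -d :1 relies on the host's display :1 being reachable from inside the container, one thing that may be worth checking first is whether the matching socket is visible there at all. The following is only a sketch: it assumes the socket lives in /tmp/.X11-unix under the name X<number>, and both helper names are hypothetical:

```shell
# Extract the display number from a DISPLAY value such as ":1" or "host:1.0".
display_number() {
    d="${1##*:}"      # drop everything up to the last colon -> "1" or "1.0"
    echo "${d%%.*}"   # drop an optional ".screen" suffix     -> "1"
}

# Succeed only if a Unix socket for that display exists in the directory
# given as second argument (default: /tmp/.X11-unix).
display_socket_exists() {
    [ -S "${2:-/tmp/.X11-unix}/X$(display_number "$1")" ]
}
```

Inside the container, display_socket_exists :1 || echo "no socket for :1" would show whether the volume mapping exposes the host display. A present socket does not prove that the X server accepts the connection (that still depends on xhost/Xauthority settings) or that a suitable GLX driver is available.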

After reading about this a bit, it looks like the cause is that the Mesa libraries are being loaded instead of the Nvidia ones. So I installed the Nvidia driver:

sudo apt-get update
sudo apt install nvidia-driver-510

After that, vglrun -d :1 glmark2 fails with:

X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  152 (GLX)
  Minor opcode of failed request:  24 (X_GLXCreateNewContext)
  Value in failed request:  0x0
  Serial number of failed request:  30
  Current serial number in output stream:  31
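To see which OpenGL implementation is actually being picked up, the renderer string reported by glxinfo can serve as a quick indicator. This sketch assumes glxinfo (from mesa-utils) is installed in the container; classify_renderer is a hypothetical helper, and the list of software renderer names is an assumption that may not be complete:

```shell
# Read glxinfo output on stdin and report whether the renderer string looks
# like Mesa software rendering (llvmpipe, softpipe, swrast) or something else.
# Hypothetical helper for illustration only.
classify_renderer() {
    line=$(grep -i 'OpenGL renderer string' | head -n 1) || true
    case "$line" in
        "") echo "unknown" ;;                          # no renderer line found
        *llvmpipe*|*softpipe*|*swrast*) echo "software" ;;
        *) echo "other" ;;                             # likely a hardware driver
    esac
}
```

It could be used as glxinfo -B | classify_renderer, and again as vglrun -d :1 glxinfo -B | classify_renderer to compare the two paths. A result of "software" would suggest that the Mesa software path is still in use.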

I will try to investigate this further and also try it out on AWS (in the end I want to run this on an EC2 instance with an Nvidia GPU).

I will post here if I find out something. However, it looks like the Nvidia card makes hardware acceleration more difficult to set up.

accetto commented 2 years ago

@uflik Any success?

accetto commented 2 years ago

This seems to be stalled. Closing it.

accetto / headless-drawing-g3, issue #3