henzler closed this issue 6 years ago.
Same issue here. Any followup?
For GL ... what GL libraries is the program linking to?
ldd /path/to/gl/binary
You may need to add some libraries into $sysconfdir/singularity/nvliblist.conf
As for the display ... You may need to add something like the following:
export SINGULARITYENV_DISPLAY=${DISPLAY}
Here is the result for ldd /usr/bin/glxgears:
linux-vdso.so.1 => (0x00007ffd1ebe7000)
libGL.so.1 => /.singularity.d/libs/libGL.so.1 (0x00007fb8d8d48000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb8d8a32000)
libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007fb8d86f8000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb8d832f000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb8d812a000)
libGLX.so.0 => /.singularity.d/libs/libGLX.so.0 (0x00007fb8d7efa000)
libGLdispatch.so.0 => /.singularity.d/libs/libGLdispatch.so.0 (0x00007fb8d7c2c000)
/lib64/ld-linux-x86-64.so.2 (0x0000561e12fa6000)
libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007fb8d7a09000)
libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007fb8d77f7000)
libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007fb8d75f3000)
libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007fb8d73ec000)
I am not sure what I should add to nvliblist.conf. Currently it has the following libraries:
libcuda.so
libEGL_installertest.so
libEGL_nvidia.so
libEGL.so
libGLdispatch.so
libGLESv1_CM_nvidia.so
libGLESv1_CM.so
libGLESv2_nvidia.so
libGLESv2.so
libGL.so
libGLX_installertest.so
libGLX_nvidia.so
libglx.so
libGLX.so
libnvcuvid.so
libnvidia-cfg.so
libnvidia-compiler.so
libnvidia-eglcore.so
libnvidia-egl-wayland.so
libnvidia-encode.so
libnvidia-fatbinaryloader.so
libnvidia-fbc.so
libnvidia-glcore.so
libnvidia-glsi.so
libnvidia-gtk2.so
libnvidia-gtk3.so
libnvidia-ifr.so
libnvidia-ml.so
libnvidia-opencl.so
libnvidia-ptxjitcompiler.so
libnvidia-tls.so
libnvidia-wfb.so
libOpenCL.so
libOpenGL.so
libvdpau_nvidia.so
nvidia_drv.so
tls_test_.so
libGL.so.1 => /.singularity.d/libs/libGL.so.1 (0x00007fb8d8d48000)
libGLX.so.0 => /.singularity.d/libs/libGLX.so.0 (0x00007fb8d7efa000)
libGLdispatch.so.0 => /.singularity.d/libs/libGLdispatch.so.0 (0x00007fb8d7c2c000)
It looks like the main GL libs are being pulled from those brought in by the --nv
option. The others are the X server libs, which are going to be container-dependent.
Have you tried setting export SINGULARITYENV_DISPLAY=${DISPLAY}? What that will do is set the DISPLAY
environment variable inside the container to the value it has on the host.
I am not sure what you mean by ${DISPLAY}. This variable is an empty string for me... I tried setting SINGULARITYENV_DISPLAY directly with that command and it does not work.
Umm... okay... DISPLAY holds the display of the X server. For instance, on my laptop, the display is :0.0.
Its layout is basically [host]:[display][.screen]. That is telling the X server where to send the graphical output. When you have a VirtualGL setup, etc., you'll end up with a DISPLAY like :10.0.
:0.0 is generally the monitor hooked up to the graphics card, even headless; it's the local display. :10.0 would be an offset display. Another user logged in at the same time could get :12.0, and so on.
The other displays will generally proxy through the local display, depending on your setup. You can configure multiple displays in the X configuration, but I highly doubt that is the case here, so generally how this works is that the system automatically logs into the GUI, and the default user is set up to allow connections to the local display from the local machine.
Each user that spawns off a new session is then assigned a DISPLAY that they write to, but it's rendered on display :0.0, which is the local display / hardware.
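For reference, a quick way to check that the host's DISPLAY value makes it into the container (the image name below is just a placeholder):
echo $DISPLAY                                # e.g. :0.0 on the host
export SINGULARITYENV_DISPLAY=${DISPLAY}
singularity exec --nv my-image.simg bash -c 'echo $DISPLAY'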
Thanks for the explanation. I compiled the code at the top of this thread to get the executable. If I run the executable inside the container, I get an error saying that no device is detected and there is no EGL display. However, if I run it outside of the container, it is able to detect 4 GPUs.
Both inside and outside the container, DISPLAY is an empty string.
Hrmm... Are you doing an EGL context as well (as in the original post)? The only thing I can find for that is that NVIDIA says to link against libOpenGL.so and libEGL.so, but you have it linking against libGLX.so for that context.
What is the full singularity command you're using?
The singularity command I am using is just
singularity shell --nv chester/containers/ubuntu-16.04-lts-rl.img
When I print out the shared libraries used by the executable inside the container using ldd ./test, here is what I got:
linux-vdso.so.1 => (0x00007ffcfe9a8000)
libEGL.so.1 => /.singularity.d/libs/libEGL.so.1 (0x00007fdc39b5d000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fdc397ce000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdc39405000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fdc39201000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fdc38ef7000)
libGLdispatch.so.0 => /.singularity.d/libs/libGLdispatch.so.0 (0x00007fdc38c29000)
/lib64/ld-linux-x86-64.so.2 (0x0000561d48a0c000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fdc38a13000)
And if I do this outside of the container, here is what I got:
linux-vdso.so.1 => (0x00007fffccda3000)
libEGL.so.1 => /lib64/libEGL.so.1 (0x00007f34ddf3c000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f34ddc33000)
libc.so.6 => /lib64/libc.so.6 (0x00007f34dd870000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f34dd66c000)
libm.so.6 => /lib64/libm.so.6 (0x00007f34dd369000)
libGLdispatch.so.0 => /lib64/libGLdispatch.so.0 (0x00007f34dd09b000)
/lib64/ld-linux-x86-64.so.2 (0x000055f0ce115000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f34dce85000)
Outside the container, it works fine. To clarify, the source code used to compile the executable is the same as in the original post.
I have compared the two libEGL.so.1 files that are used and they are identical...
Okay, I built that and compiled it ... After an strace on the run, I needed to add -B /usr/share/glvnd as an option to bind mount that directory in.
There's a file named /usr/share/glvnd/egl_vendor.d/10_nvidia.json that it tried opening. That config file doesn't exist in the container, and is based around what's installed on the host.
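For reference, on a typical driver install that file just points glvnd at the NVIDIA EGL vendor library; its contents are usually along these lines:
{
    "file_format_version" : "1.0.0",
    "ICD" : {
        "library_path" : "libEGL_nvidia.so.0"
    }
}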
To verify what your binary is looking for, run something like:
strace ./egl 2>&1 | less
Then look for lines that contain egl_vendor.d
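For example, to filter the trace directly:
strace ./egl 2>&1 | grep egl_vendor.d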
$ /usr/local/singularity/2.6.0/bin/singularity exec --nv -B /usr/share/glvnd cdash/ ~/tmp/egl
Detected 1 devices
EGL eglDpy: 0x6446f0
minor, major: 4, 1
EGL numConfigs: 1
EGL eglCfg: 0xcaf339
EGL Ctx: 0x649001
Note: The cdash/ sandbox is the only one I had with X libraries... :/
Your solution solved my problem! And also thanks a lot for explaining how you found it!
I believe (I only tested with Apptainer and on NixOS) that the (nvliblist-based) --nv still does not detect the host's glvnd configuration. Should we reopen the issue?
This git repository is closed. If you can reproduce the problem with singularity-ce, open a new issue at https://github.com/sylabs/singularity; otherwise open a new one at https://github.com/apptainer/apptainer.
Version of Singularity:
2.5.1-master.gd6e81547 (also tested: 2.4)
Problem
Hello, I am trying to create an OpenGL context via the following code:
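A minimal sketch of this kind of headless EGL context creation, using the EGL_EXT_platform_device path (details here are illustrative rather than the exact original code; build with g++ hello.cpp -lEGL):

#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <cassert>
#include <cstdio>

int main() {
    // Load the device-enumeration and platform-display extension entry points.
    PFNEGLQUERYDEVICESEXTPROC eglQueryDevicesEXT =
        (PFNEGLQUERYDEVICESEXTPROC)eglGetProcAddress("eglQueryDevicesEXT");
    PFNEGLGETPLATFORMDISPLAYEXTPROC eglGetPlatformDisplayEXT =
        (PFNEGLGETPLATFORMDISPLAYEXTPROC)eglGetProcAddress("eglGetPlatformDisplayEXT");
    assert(eglQueryDevicesEXT && eglGetPlatformDisplayEXT);

    // Enumerate the EGL devices (GPUs) exposed by the driver.
    EGLDeviceEXT devices[16];
    EGLint numDevices = 0;
    eglQueryDevicesEXT(16, devices, &numDevices);
    printf("Detected %d devices\n", numDevices);

    // Open a display on the first device instead of going through X11.
    EGLDisplay eglDpy =
        eglGetPlatformDisplayEXT(EGL_PLATFORM_DEVICE_EXT, devices[0], nullptr);
    printf("EGL eglDpy: %p\n", (void*)eglDpy);
    assert(eglDpy != NULL);

    EGLint major = 0, minor = 0;
    eglInitialize(eglDpy, &major, &minor);
    printf("minor, major: %d, %d\n", minor, major);

    // Pick a pbuffer-capable config and create a desktop OpenGL context.
    const EGLint cfgAttribs[] = {
        EGL_SURFACE_TYPE, EGL_PBUFFER_BIT,
        EGL_RENDERABLE_TYPE, EGL_OPENGL_BIT,
        EGL_NONE
    };
    EGLConfig eglCfg;
    EGLint numConfigs = 0;
    eglChooseConfig(eglDpy, cfgAttribs, &eglCfg, 1, &numConfigs);
    printf("EGL numConfigs: %d\n", numConfigs);

    eglBindAPI(EGL_OPENGL_API);
    EGLContext eglCtx = eglCreateContext(eglDpy, eglCfg, EGL_NO_CONTEXT, nullptr);
    printf("EGL Ctx: %p\n", (void*)eglCtx);

    eglTerminate(eglDpy);
    return 0;
}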
Outside the container the program works and prints:
Detected 4 devices
EGL eglDpy: 0x5607ef979890
minor, major: 4, 1
EGL numConfigs: 1
EGL eglCfg: 0xcaf353
EGL Ctx: 0x5607ef9893f1
Inside the container (with --nv) I get:
Detected 0 devices
EGL eglDpy: 0
minor, major: 32, 21907
EGL numConfigs: 0
EGL eglCfg: 0x10000ffff
test: hello.cpp:80: int main(int, char**): Assertion `eglDpy != NULL' failed.
Steps to reproduce behavior
My singularity file is very simple and looks like this:
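A minimal definition of that kind (the Ubuntu 16.04 base here is an assumption) is just:

Bootstrap: docker
From: ubuntu:16.04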
So basically I am not doing anything to the "guest" system with the Singularity file; I only create an Ubuntu image, and everything else should be the same?
EDIT
As soon as I go into a singularity container like this:
singularity exec --nv pytorch.simg bash
I get no results for:
find /usr -type f -name "libGL*"
On my host system, however, I get:
find -type f | wc -l
HOST: 4026 GUEST: 1602