NixOS / nixpkgs


libGL not working on non-NixOS (without setting up) #9415

Open joepie91 opened 9 years ago

joepie91 commented 9 years ago

Installation works, running it does not.

sven@linux-etoq:~> nix-env -i openarena
installing ‘openarena-0.8.8’
these paths will be fetched (390.19 MiB download, 426.80 MiB unpacked):
  /nix/store/q8smn7k0y349ymfc9mdnja0wa6s79njv-openarena-0.8.8
fetching path ‘/nix/store/q8smn7k0y349ymfc9mdnja0wa6s79njv-openarena-0.8.8’...

*** Downloading ‘https://cache.nixos.org/nar/194liswh0vgrj02q7q7ww8igizn45bri6ip4m5wxa47s3ghj65q7.nar.xz’ to ‘/nix/store/q8smn7k0y349ymfc9mdnja0wa6s79njv-openarena-0.8.8’...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  390M  100  390M    0     0  7996k      0  0:00:49  0:00:49 --:--:-- 9247k

building path(s) ‘/nix/store/j0lc0hns2kna9sw1g3k46zbazxhsxha0-user-environment’
sven@linux-etoq:~> openarena 
/nix/store/q8smn7k0y349ymfc9mdnja0wa6s79njv-openarena-0.8.8/openarena-0.8.8/openarena.x86_64: error while loading shared libraries: libGL.so.1: cannot open shared object file: No such file or directory
nh2 commented 7 years ago

@expipiplus1 Note you can use ldconfig to get the path of the nvidia driver, e.g.

$ ldconfig -f /etc/ld.so.conf -C /etc/ld.so.cache -p | grep libGL.so
    libGL.so.1 (libc6,x86-64) => /usr/lib/nvidia-375/libGL.so.1
    libGL.so.1 (libc6) => /usr/lib32/nvidia-375/libGL.so.1
    libGL.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libGL.so
    libGL.so (libc6,x86-64) => /usr/lib/nvidia-375/libGL.so
    libGL.so (libc6) => /usr/lib32/nvidia-375/libGL.so
guibou commented 7 years ago

Do you know something I can do, as an application developer, to ensure my application works on non-NixOS? I cannot ask my client to manually patch their system, however I can do anything ugly in my code to make it work.

Do you think there is something to do with libglvnd? As far as I understand it, a package can be built with libglvnd as a pure dependency and depend only on it. At runtime, libglvnd will search for a suitable driver (I don't know how).

(This issue is really a pain, this is the only reason we still use our ten year old ubuntu machine to build ;)

nh2 commented 7 years ago

@guibou I don't know libglvnd, but you can certainly change your application (technically probably a wrapper script you have around it) to call ldconfig at startup-time, and parse its output to automatically determine which libraries you should LD_PRELOAD.

If you do that, please share your work here!
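
A minimal sketch of the ldconfig idea, assuming the host's linker cache is at /etc/ld.so.cache as in the listing above. The helper name find_system_libgl and the wrapper lines in the comments are illustrative and do not come from this thread. Preloading a host libGL into a Nix-built process may still fail if the host library's own dependencies clash with the Nix ones.

```shell
# find_system_libgl: read `ldconfig -p` style output on stdin and print the
# path of the first 64-bit libGL.so.1 entry (hypothetical helper name).
find_system_libgl() {
  awk '/libGL\.so\.1 / && /x86-64/ { print $NF; exit }'
}

# Possible use in a wrapper script (commented out; needs the host's ldconfig):
#   libgl=$(ldconfig -f /etc/ld.so.conf -C /etc/ld.so.cache -p | find_system_libgl)
#   [ -n "$libgl" ] && export LD_PRELOAD="$libgl${LD_PRELOAD:+:$LD_PRELOAD}"
#   exec "$@"
```

The parsing is kept separate from the ldconfig call so it can be checked against a captured listing before it is wired into a launcher.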

donbright commented 7 years ago

the whole '/run' thing invalidates the idea that "to uninstall nix, just rm -rf /nix"

and i really just need a software GL driver working... im running on VMs. seems like this should be do-able without interacting at all with system libraries. I tried export LIBGL_ALWAYS_SOFTWARE=1 and it didn't fix anything

vcunat commented 7 years ago

Different hardware is best used with different drivers; in particular, some VMs have their own libGL implementations that use the host's acceleration, IIRC. (I'd personally use a NixOS VM, but you probably have reasons not to.)

donbright commented 7 years ago

@vcunat GL is not just used for interactive graphics; it is also used to generate image files and for regression testing of programs. VMs are often used for testing, and the hardware GL driver is irrelevant in such cases because you want to test the program itself without dealing with hardware. Using software-only rendering for GL should not involve any hardware-specific GPU drivers of any kind on any system. This is also good for virtual headless X systems like Xvfb or Xvnc when run on remote servers.

(deleted my prev comment about mmap EPERM when copying files to /run, dont copy files to /run)


update i just noticed that

/run requires root access

/run is not persistent across reboots https://wiki.debian.org/ReleaseGoals/RunDirectory

dezgeg commented 7 years ago

Setting:

LIBGL_DRIVERS_PATH=$(nix-build --no-out-link -A mesa_drivers)/lib/dri
LD_LIBRARY_PATH=$(nix-build --no-out-link -A mesa_noglu)/lib

should work (assuming no proprietary GPU drivers) without touching /run.
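
A small sketch of how those two variables could be scoped to a single command instead of being exported for the whole session. The helper name run_with_nix_mesa is hypothetical. The attribute names mesa_noglu and mesa_drivers are taken from the comment above and may not exist in other nixpkgs revisions.

```shell
# run_with_nix_mesa MESA_OUT DRIVERS_OUT CMD [ARGS...]
# Run CMD with the two variables from the comment above set only for that
# command (hypothetical helper; the caller passes in the two store paths).
run_with_nix_mesa() {
  mesa_out=$1
  drivers_out=$2
  shift 2
  LIBGL_DRIVERS_PATH="$drivers_out/lib/dri" \
  LD_LIBRARY_PATH="$mesa_out/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" \
    "$@"
}

# Example (commented out; needs Nix, run from a nixpkgs checkout):
#   run_with_nix_mesa "$(nix-build --no-out-link -A mesa_noglu)" \
#                     "$(nix-build --no-out-link -A mesa_drivers)" glxinfo
```

Child processes of the command still inherit both variables, but the rest of the shell session stays untouched.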

guibou commented 7 years ago

I'm trying to understand the issue. OpenGL is, by definition, impure. You need to link with the system provided libGL.so to get something working with hardware acceleration.

We usually link with the libGL provided in nixpkgs in the package mesa-noglu, and that one fails to load the necessary shared libraries. Most of the "hackish" solutions involve copying the shared libraries into /run/... and changing the runpath of them and of all their recursive dependencies. This is complicated, system dependent and highly intrusive.

Another solution, proposed in nix-install-vendor-gl.sh, is to install in the nix store the same driver and library as those provided by the system. This seems a good idea, however it is a pain to make it work for every system, every driver version, ... As far as I know, it only supports the main line of nvidia drivers, on ubuntu.

However, it is possible to add to our executable a runpath (in last position) which is the one of the host system (usually /usr/lib). This way, the program will find the system libGL. However, it still fails to load because the system libGL has an empty runpath: it depends on the system dynamic loader's builtin list of directories. The nix dynamic loader does not come with this list.
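
For illustration, adding a host directory in last position of a runpath could be sketched as below. The helper name append_rpath and the ./app placeholder are assumptions. The helper only builds the string; the patchelf call is left in a comment. As described above, this alone does not make the system libGL's own dependencies resolvable.

```shell
# append_rpath OLD_RPATH DIR
# Print OLD_RPATH with DIR added in last position, without duplicating it
# (hypothetical helper; pure string handling, nothing is patched here).
append_rpath() {
  old=$1
  dir=$2
  case ":$old:" in
    *":$dir:"*) printf '%s\n' "$old" ;;       # already present, keep as is
    "::")       printf '%s\n' "$dir" ;;       # empty runpath
    *)          printf '%s\n' "$old:$dir" ;;  # append so Nix paths are searched first
  esac
}

# Possible use with patchelf (commented out; ./app is a placeholder):
#   patchelf --set-rpath "$(append_rpath "$(patchelf --print-rpath ./app)" /usr/lib)" ./app
```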

What will happen if we ask the nix dynamic loader to respect the system provided library configuration AFTER the runpath provided inside the executable / shared libraries? I have the feeling that it will not break anything nix related, because a correctly built nix package should find its dynamic library in the directory provided as runpath, however it may fix most of the issues with OpenGL, and, I hope, may also work with weird configurations such as primus.

Did I miss something?

Side note: the case where you are sure you want the software mesa GL can be handled differently, by, for example, providing a "mesa-pure" package.

guibou commented 6 years ago

I just experimented with that https://github.com/guibou/nixGL solution.

It works out of the box for me, and I find the idea elegant (disclaimer: that's my idea ;) and it should be really easy to adapt to any system. It is just a wrapper script, and it does not change anything on the host system. The wrapper script can be installed with nix.

This solves most of my issues with OpenGL on non nix system (with the drawback of needing to call the binary through a wrapper script).

vcunat commented 6 years ago

That could work well enough. I believe it might do somewhat unexpected things when running sub-processes, because LD_LIBRARY_PATH is simply inherited. Also, running non-ELF executables probably won't work with this script, e.g. those produced via makeWrapper.

guibou commented 6 years ago

@vcunat I'm now convinced that the only solution to fix this issue permanently and in a robust way is a patch to the dynamic loader. When the dynamic loader looks for libGL.so, instead of following the rpath of the current executable, it would follow the rules of the host operating system (using its dynamic loader settings), and would do so recursively.

bjornfor commented 6 years ago

@guibou: Interesting!

vcunat commented 6 years ago

There's a more difficult problem in there, similar to NixOS: the various dependencies of libGL. Unless we want some real black magic, you will only have one version of (say) libc loaded at once in a process; either ours or the OS one has to win. Well, libc itself is probably the least risky one here, but some implementations depend on more fragile libs (we had this libwayland issue).

dezgeg commented 6 years ago

Yes. Hopefully SomeoneElse(tm) gets the black magic working: https://sourceware.org/ml/libc-help/2017-05/msg00016.html

twhitehead commented 6 years ago

I didn't have any luck trying to get my Debian MESA OpenGL library to work with Nix executables. What did work though was just using the Nix MESA OpenGL with the Nix executables.

sudo bash -c "$(declare -p); nix-build -A 'mesa_noglu.drivers' $HOME/.nix-defexpr/channels/nixpkgs -o /run/opengl; mv -T /run/opengl-driver{s,}"

This just creates a symlink from /run/opengl-driver to the drivers output of the Nix mesa_noglu package and registers it with the garbage collector. nix-build suffixes the output with -drivers, so we have to rename that to -driver (this step was missing from the first version, as @woffs noted later).

The $(declare -p) bit imports the user's environment into the superuser's environment. This is useful because Nix is likely installed in single-user mode, so the superuser won't be set up correctly to use it.

deepfire commented 6 years ago

@twhitehead, cool! What GPU stack do you have, Intel?

woffs commented 6 years ago

Great, works, after doing ln -s opengl-drivers /run/opengl-driver :-)

twhitehead commented 6 years ago

@deepfire AMD. It's an Asus Radeon R9 270 IIRC. Bought it a couple of years ago specifically so I didn't have to mess around with any closed source pain and the large slow fans made it super quiet.

dezgeg commented 6 years ago

I have been trying to get libcapsule working: https://github.com/dezgeg/libcapsule https://github.com/dezgeg/nixpkgs/tree/libcapsule. So far it compiles a libGL stub successfully but glxinfo crashes somewhere in the dynamic linker during the very first call it does (glXChooseVisual)...

#0  0x00007ffff7debea0 in add_to_global () from /nix/store/yydnhs7migvlbl48wpsxan1yvq2icbr9-glibc-2.25-49/lib/ld-linux-x86-64.so.2
#1  0x00007ffff7decec4 in dl_open_worker () from /nix/store/yydnhs7migvlbl48wpsxan1yvq2icbr9-glibc-2.25-49/lib/ld-linux-x86-64.so.2
#2  0x00007ffff7598001 in _dl_catch_error () from /nix/store/yydnhs7migvlbl48wpsxan1yvq2icbr9-glibc-2.25-49/lib/libc.so.6
#3  0x00007ffff7dec107 in _dl_open () from /nix/store/yydnhs7migvlbl48wpsxan1yvq2icbr9-glibc-2.25-49/lib/ld-linux-x86-64.so.2
#4  0x00007ffff4fb8f66 in dlopen_doit ()
#5  0x00007ffff5d08001 in _dl_catch_error ()
#6  0x00007ffff4fb9569 in _dlerror_run ()
#7  0x00007ffff4fb8ff1 in dlopen@@GLIBC_2.2.5 ()
#8  0x00007ffff2e68778 in driOpenDriver ()
#9  0x00007ffff2e6f6bb in dri3_create_screen ()
#10 0x00007ffff2e43709 in __glXInitialize ()
#11 0x00007ffff2e3f91b in GetGLXPrivScreenConfig.part.2 ()
#12 0x00007ffff2e3fa83 in glXChooseVisual ()
#13 0x0000000000404148 in mesa_hack (dpy=0x7ffff7eed010, scrnum=0) at glxinfo.c:1168
#14 0x0000000000404417 in main (argc=1, argv=0x7fffffffb2e8) at glxinfo.c:1257
vcunat commented 6 years ago

Sounds really interesting/promising. Perhaps we should try asking the author.

donbright commented 6 years ago

inspired by all the above... i was experimenting with a kludge for OpenSCAD that uses some, uhm, unusual facts about dynamic linkers and the directory searching of dlopen() on linux, plus patchelf. I have tested about a dozen different linux distros on several different x86 machines, including amd hardware, nvidia hardware, and intel 915 and 965 hardware, with open source drivers, nvidia's closed source driver, and the software driver.

https://github.com/openscad/openscad/tree/nixbuild

The basic theory is that you isolate the various dependency library .so files into multiple subdirectories, and mess around with the RPATHs and other unusual features of ELF on linux. In this way you can manipulate not only your main program into loading a specific libGL.so, but you can also manipulate libGL.so into loading specific driver .so files, and also manipulate those driver .so files into loading specific versions of libraries that they need. So you can have two different versions of libwhatever.so, in the overall dependency tree of your main program, and it turns out to be OK.

The duplicate .so files aren't technically linked "directly" into your program... they are loaded individually by the shared object loader at runtime. Combine this fact with the hierarchically isolated way that ELF dynamic object files, including libGL.so, load and call their dependencies at runtime, and with our RPATH manipulation, and it means that you can have somefunction() inside libwhatever.so that is from Nvidia's proprietary bundle of files, then also have somefunction() inside libwhatever.so from the system, and they won't crash or smash each other. This also works with stuff like QT which dynamically loads libGL.so.

The basic theory is here in a tiny program that i used to test this hypothesis:

https://github.com/donbright/mrpblt

I am like 90% sure this is correct? It seemed to work with OpenSCAD ok on, as i said, about a dozen different linux versions, on four machines, and I was deep into testing, but the complexity of figuring out the simplest way to organize the build process so that it would work both with Nvidia proprietary drivers and open source drivers kind of overwhelmed me. It winds up copying a huge tree of .so files into a subdirectory of the build, inspired by how the __qt5_nix__ subdirectory magically comes into existence for qmake qt5 programs under nix shell. It also requires modifying the generated binary so that it only looks under those subdirs for certain things.

Also reading about why Nvidia has proprietary drivers, and how the PC graphics card industry works, was kind of depressing. The amount of time any graphics project spends working around OpenGL issues might be better spent preparing for Vulkan future: https://www.gamedev.net/forums/topic/666419-what-are-your-opinions-on-dx12vulkanmantle/?tab=comments#comment-5215019

guibou commented 6 years ago

Hello.

I did an update of nixGL: https://github.com/guibou/nixGL since I have now a better understanding of the problem. The previous limitations I had with my previous approach are now removed.

deepfire commented 6 years ago

@guibou, how does nixGLNvidia address the case when the Nixpkgs-provided nVidia client-side component version doesn't match the nVidia host side? In my experience it's often a crash, which is why nix-install-vendor-gl goes to the pains of version detection etc.

guibou commented 6 years ago

@deepfire as you said, it may (will) crash. My solution is now really close to yours. I wanted to contribute to yours instead, but I was blocked by my poor shell skills ;)

(Edit: nixGL can now use a specific version, manually specified by the user during nix-build)

cx405 commented 6 years ago

@guibou Hi! I have AMD hardware (R9 280 and also mobility hd4650) and am willing to run any tests to help out with the project. I am a bit new to this though, so sorry up front if I misunderstand anything.

I noticed that both urban-terror and openarena (nix packages) do not run on nix, causing the screen to go blank, and I also have some proprietary titles, for what it's worth.

guibou commented 6 years ago

@cx405 please open a bug report on the nixGL page with some details about your amd driver. I'll contact you directly for tests. (I don't have this kind of hardware, so it will be a kind of ping pong between you and me ;)

cx405 commented 6 years ago

@guibou aye! / edit: done :) // edit: I am currently migrating to nix 18.03 (and have some problems in channel today). I have old 17.09 generation ready to boot into, just in case. /// edit: I migrated properly to 18.03 and will update the cfg in nixGL issue.

cx405 commented 6 years ago

Wow, I found the reason of my problems with OpenGL in nixos on AMD/Ati cards! It looks like "radeon" does not suffice in nixos - one needs to specify also "ati" and "vesa": #37673

pmiddend commented 5 years ago

So still the only working solution is some external tool which needs the current nvidia driver version?

wedens commented 5 years ago

@pmiddend Yeah, https://github.com/guibou/nixGL is still the best solution/workaround we have, unfortunately.

edahlgren commented 5 years ago

I had a lot of trouble getting https://github.com/guibou/nixGL to work on my laptop. That's probably because I don't understand my particular cocktail of graphic drivers very well, but then again I don't think most people do. So I spent a couple days coming up with a simpler solution. Because I don't have other hardware to test on, I'd love to know if this works or doesn't work for anyone else.


tl;dr Use libGL indirect rendering through GLX. For why, see note [1] below.

Step 1: Enable indirect glx on an X server not installed by nix (e.g. for Linux, the one installed by default on your host) by running it with the command line option +iglx.

Step 2: Modern libGL binaries that don't find a graphics driver for direct rendering (e.g. XXX_nvidia.so.0) look for libGLX_indirect.so.0. Make a libGLX_indirect.so.0 symlink that points to libGLX_mesa.so.0 [2]. Then set LD_LIBRARY_PATH to include your libGLX_indirect.so.0 so your libGL dependent binary finds it:

$ nix-shell -p mesa glxinfo

[nix-shell:~]$ find /nix/store -name libGLX_mesa.so.0
/nix/store/bmydwxd7b21315n61fzfzzar6vsafjwn-mesa-noglu-18.3.4-drivers/lib/libGLX_mesa.so.0

[nix-shell:~]$ mkdir libindirect   # could be named anything
[nix-shell:~]$ ln -s /nix/store/bmydwxd7b21315n61fzfzzar6vsafjwn-mesa-noglu-18.3.4-drivers/lib/libGLX_mesa.so.0 libindirect/libGLX_indirect.so.0
[nix-shell:~]$ ls -l libindirect/libGLX_indirect.so.0
libindirect/libGLX_indirect.so.0 -> /nix/store/bmydwxd7b21315n61fzfzzar6vsafjwn-mesa-noglu-18.3.4-drivers/lib/libGLX_mesa.so.0

[nix-shell:~]$ export LD_LIBRARY_PATH=$(pwd)/libindirect:$LD_LIBRARY_PATH

Step 3: Then run an application linked with nix libGL:

[nix-shell:~]$ glxinfo | head
libGL error: unable to load driver: swrast_dri.so
libGL error: failed to load driver: swrast
name of display: :0
display: :0  screen: 0
direct rendering: No (LIBGL_ALWAYS_INDIRECT set)
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
server glx extensions:
    GLX_ARB_context_flush_control, GLX_ARB_create_context, 
    GLX_ARB_create_context_profile, GLX_ARB_create_context_robustness, 
    GLX_ARB_fbconfig_float, GLX_ARB_multisample, GLX_EXT_buffer_age, 
    GLX_EXT_create_context_es2_profile, GLX_EXT_create_context_es_profile, 

Step 4: (optional) To get rid of "libGL error: failed to load driver: swrast" related errors:

[nix-shell:~]$ export LIBGL_ALWAYS_INDIRECT=1

[nix-shell:~]$ glxinfo | head
name of display: :0
display: :0  screen: 0
direct rendering: No (LIBGL_ALWAYS_INDIRECT set)
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
server glx extensions:
    GLX_ARB_context_flush_control, GLX_ARB_create_context, 
    GLX_ARB_create_context_profile, GLX_ARB_create_context_robustness, 
    GLX_ARB_fbconfig_float, GLX_ARB_multisample, GLX_EXT_buffer_age, 
    GLX_EXT_create_context_es2_profile, GLX_EXT_create_context_es_profile, 

Notice "direct rendering: No" in the output of glxinfo. That means that you're using indirect gl rendering contexts. If you install glxinfo from your non-nix package manager (e.g. apt, yum) and run it directly on your host, you should see that change to "direct rendering: Yes".
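
Steps 2 to 4 above could be collected into two small helpers, sketched here under the same assumptions as the walkthrough (an X server started with +iglx, and a libGLX_mesa.so.0 somewhere in the nix store). The function names are hypothetical, and the lookup with find in the comments simply repeats the command from Step 2.

```shell
# setup_indirect_glx MESA_GLX_LIB TARGET_DIR
# Create TARGET_DIR/libGLX_indirect.so.0 as a symlink to MESA_GLX_LIB and
# print TARGET_DIR so the caller can add it to LD_LIBRARY_PATH.
setup_indirect_glx() {
  mesa_glx_lib=$1
  target_dir=$2
  mkdir -p "$target_dir" || return 1
  ln -sf "$mesa_glx_lib" "$target_dir/libGLX_indirect.so.0" || return 1
  printf '%s\n' "$target_dir"
}

# direct_rendering_status: read glxinfo output on stdin, print "Yes" or "No".
direct_rendering_status() {
  awk '/^direct rendering:/ { print $3; exit }'
}

# Possible use inside the nix-shell (commented out; needs Nix and a display):
#   mesa_lib=$(find /nix/store -name libGLX_mesa.so.0 | head -n 1)
#   dir=$(setup_indirect_glx "$mesa_lib" "$PWD/libindirect")
#   export LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" LIBGL_ALWAYS_INDIRECT=1
#   glxinfo | direct_rendering_status
```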


[1] This approach is based on what you'd do if you wanted HW acceleration from a GPU on a remote machine (send rendering requests over the network to the machine with the GPU you want to use). Except in this case you don't go over the network, you just redirect to localhost. If you think about the nix store like a root filesystem on a machine without a GPU --- lacking host-specific graphics drivers, etc --- then I think this makes a lot of sense. Basically it's one level of indirection through an X server to deal with variation in host graphics drivers.

[2] I'll be honest, I'm not a graphics person. So I don't fully understand how generic libGLX_mesa.so.0 is. If it is acting as a generic X client, then I think this solution is host independent. If libGLX_mesa.so.0 is doing something else (and I'm just getting lucky with my setup), then this probably won't work for everyone. Please give it a try and let me know if it works for you --- or provide more info on the limitations of mesa.

[3] Performance: Any sort of indirection has an overhead, so this won't perform the same as direct rendering. There was no noticeable lag with a simple program like glxgears, nor with QtCreator and a simple Qt program. I didn't test games, but if using a local pipe is too slow (I believe that's the default), there's a shared memory extension you can add to most X servers.

[4] Is this stable? Wayland claims to not support indirect contexts, but for backwards compatibility Wayland has XWayland which does. And generally speaking, indirect rendering is such a useful feature that I don't see it disappearing any time soon.

@guibou, @vcunat, @joepie91, @deepfire, @ttuegel --- I'd be interested to know how you feel about this workaround and if it could be made more generic (minus the host X server configuration of course).

guibou commented 5 years ago

@edahlgren Could you also open an issue on nixGL tracker with details about your configuration?

edahlgren commented 5 years ago

@guibou No problem! See this, which also appears to be linked just above.

schell commented 5 years ago

I think I'm running into this issue as well here.

schell commented 5 years ago

Forgive me, but the solution to this problem is a bit hard to parse. I'd like to use nix to build an SDL based game that I can distribute to friends who are running Linux. Will the solutions here work for that case or is this simply the case of running locally for myself?

edahlgren commented 5 years ago

@schell You might have a look at nix-bundle (nix -> AppImage), but it's not a complete solution for applications that need hardware rendering through a system libGL (links to this issue). You could use software rendering for libGL, but that might not be fast enough, I don't know. Otherwise the solutions above I believe are for nix users.

@guibou @vcunat IIUC, nixGL will force applications to depend on specific graphics drivers and libGL libraries (that happen to match system requirements), which probably depend on their own version of libc. The application might be packaged with a very specific version of libc. Which one wins? The libc needed by the graphics driver? If so, is that safe (application developers aren't expecting a change)? Or perhaps it is unwise for graphics applications to pin to a specific version of libc in the first place (nix seems to promote pinning), given expected libGL system variation for desktop apps?

schell commented 5 years ago

Thanks @edahlgren - for now I'll continue building with stack for desktop and use nix to wrangle ghcjs.

vcunat commented 5 years ago

Pulling other libs via the C linker does cause issues sometimes, so it really helps to minimize the set; glibc libs actually provide very high level of compatibility (at least in the direction of increasing the version between build-time and run-time), but some libGL pull even libstdc++ or libwayland IIRC and these do cause problems occasionally if the versions differ significantly (they tend to have good ABI versioning, so the error should be loud and clear at least).
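
One way to see which of those libraries a given libGL pulls in is to filter ldd output, roughly as sketched here. The helper names are hypothetical, and the library path in the comment is only an example location that differs between distributions.

```shell
# list_resolved_libs: read `ldd` output on stdin and print "name path" for
# every dependency that was resolved to a file (hypothetical helper).
list_resolved_libs() {
  awk '$2 == "=>" && $3 ~ /^\// { print $1, $3 }'
}

# fragile_libs: same input, but keep only the libstdc++ and libwayland
# entries, the two families mentioned above as occasional trouble makers.
fragile_libs() {
  awk '$2 == "=>" && $3 ~ /^\// &&
       (index($1, "libstdc++") == 1 || index($1, "libwayland") == 1) { print $1, $3 }'
}

# Possible use (commented out; the path is an example and varies by distro):
#   ldd /usr/lib/x86_64-linux-gnu/libGL.so.1 | fragile_libs
```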

I'm afraid my GL knowledge is very low; I don't really have an idea about difference between (in)direct variants. I started my nixpkgs GL involvement in a time when there were only 5-10 active contributors in whole nixpkgs.

guibou commented 5 years ago

@edahlgren The idea with nixGL is to install a nixGL which correctly fits with your system. All the userspace driver will be provided by nix.

Unfortunately, for now, nixGL does not do any detection of the runtime setup, so you need to set it up yourself.

twhitehead commented 5 years ago

@edahlgren just wondering about the indirect thing, as I have recently spent a lot of time on OpenGL issues for our system (Compute Canada). Are you using X11 forwarding? I ask as I understand it tries to load libglx_$vendor where $vendor is the value reported in the GLX_EXT_libglvnd extension.

In our testing we noticed that enabling indirect rendering (the client sending the GL calls to the X11 server for rendering) was required by users of the Mac X11 server (and I expect Windows MobaXterm users as well) because it didn't support the X11 visuals required for the mesa client to perform client-side software rendering (simulating GL by rendering in software to a 2D X11 visual).

This was evident from the client program (glxinfo, glxgears, etc.) printing out a message about unsupported visuals when running over X11 forwarding from a non-linux machine. All the linux hosts we connected from did not have this issue and did not report this message. I'm presuming recent X11 servers offer the visuals required for mesa to do its client-side software rendering.

I mention this as I understand indirect rendering can involve a lot of round trips between the client and the X11 server depending on the application, which can really kill performance over higher latency connections.

The best solution we found for situations where you just need to handle X11 connections was to use a mesa compiled with glx=gallium-xlib. This gives pure client-side software rendering. It is not without its own pain though, as it severely restricts what other options you can turn on when building mesa (e.g., no EGL support, no libglvnd support, etc.).

I'm still doing some work on this (currently sidetracked on other duties), but here is the basic derivation so far

mesa_glxgallium = stdenv.mkDerivation rec {
  version = super.mesa_noglu.version;
  name = "mesa-glxgallium-${version}";
  src = super.mesa_noglu.src;

  outputs = [ "out" "dev" ];

  configureFlags = [
    "--with-gallium-drivers=swrast"
    "--with-platforms=x11"           # surfaceless would make sense, but egl requires dri
    "--disable-dri"
    "--enable-glx=gallium-xlib"      # gallium-xlib requires no dri
    "--enable-gallium-osmesa"
    "--enable-llvm"
    "--disable-egl"                  # egl requires dri for some reason
    "--disable-gbm"                  # gbm requires dri for some reason
    "--enable-llvm-shared-libs"
    "--disable-opencl"
  ];
  nativeBuildInputs = [
    autoreconfHook pkgconfig
    python2
  ];
  propagatedBuildInputs = [ ];
  buildInputs = [
    llvmPackages.llvm
    xorg.xorgproto xorg.libX11 xorg.libXext xorg.libxcb
    expat
  ];

  enableParallelBuilding = true;

  meta = with stdenv.lib; {
    description = "An open source implementation of OpenGL";
    homepage = https://www.mesa3d.org/;
    license = licenses.mit; # X11 variant, in most files
    platforms = platforms.linux;
  };
};
edahlgren commented 5 years ago

@twhitehead If you mean X11 forwarding in the sense of ssh -X, no. I'm just enabling +iglx on a local X server. In my setup, local X clients link to libGLX_indirect.so which happens to symlink to libGLX_mesa.so. These local X clients use libGLX_mesa.so to send GL requests to the X server using IPC (kernel pipe or shared memory). The X server links to host GL libs (e.g. nvidia). Doing things this way is a bit absurd given that it's all local, but the indirection acts as a boundary between nix applications and the system. In this setup, there isn't a high latency connection because there's no remote machine.

Regarding needing indirect rendering for pure software rendering to work on macOS --- that's good to know! I sent this email 2 days ago to the mesa users mailing list. It asks about building gallium on/for macOS and whether there's support for gallium on macOS without X, like how they support GDI for Windows. My goal here is to offer software rendering as a basic working option for nix users if configuring an X server or graphics drivers is too cumbersome. I'm trying to get hold of cheap, used macOS laptops or servers to test.

As for libglvnd, I ran into that too :frowning_face: On Linux at least, libglvnd seems to ask the X server which libGLX_vendor.so to load, like you say. So if you're like me and have nvidia drivers installed on your system, libglvnd will blindly try to use them, even if you want to use pure software rendering instead. To get around this, I built a gallium libGL.so from scratch and linked simple nix GUIs (e.g. qt apps, editors) to that. Here's my derivation, which looks similar to yours:

{ stdenv, fetchurl, pkgconfig
, scons, flex_2_5_35, bison2
, python27, python27Packages
, llvm_35, xorg, expat
}:

stdenv.mkDerivation rec {
  name    = "swr-${version}";
  version = "19.0.5";

  src = fetchurl {
    url    = "https://mesa.freedesktop.org/archive/mesa-${version}.tar.gz";
    sha256 = "0w2ff9gzahg4djmqmb903gm9bqqcs1x8piw7g0g5vhdy4f6bgrmn";
  };

  nativeBuildInputs = [
    pkgconfig scons flex_2_5_35 bison2
    python27 python27Packages.Mako
  ];

  buildInputs = [
    llvm_35 xorg.libX11 xorg.libXext
    xorg.libXdamage xorg.libXfixes expat
  ];

  phases = [ "unpackPhase" "buildPhase" "installPhase" ];

  buildPhase = ''
    scons build=release libgl-xlib
  '';

  installPhase = ''
    mkdir -p $out/lib
    cp build/linux-x86_64/gallium/targets/libgl-xlib/* $out/lib/
  '';
}

You use it like this (substitute qtcreator and qt5Full packages with your GUI app):

$ nix-shell -p swr qtcreator qt5Full
[nix-shell:~]$ export LD_LIBRARY_PATH=/nix/store/xhzgpc7rgj760vfxk5qad6iqaaw519qg-swr-19.0.5/lib
[nix-shell:~]$ qtcreator

I would have taken your approach (super.mesa_noglu) but I wanted to try building the recommended way with scons. See the gallium project page for more details. The derivation could probably be much improved by people with more experience than me.

Regarding this severely restricting other options you can build with (e.g. EGL, libglvnd), maybe that's OK. One can have multiple libGL.so libraries in one's nix store for different needs. Some package could try to bridge them all together into the "one true libGL.so", but I find it important to have the option to just use one separately in a simple way (like software rendering only). As for libglvnd, my understanding after reading this thread is that it's more for handling optimus setups (e.g. intel + nvidia) than being a generic dispatching library (see comment 8). Perhaps someone can correct or validate that.


@guibou Cool, right. I guess I'm curious, is there still a chance that the drivers nixGL installs can somehow link poorly with applications pinned to specific versions of libstdc++, etc? I guess that'd be true if the drivers can't be compiled from source using the same standard libs and compiler as the application?


@vcunat Got it, thanks :smile: My goal here is to make it very easy for the academics I work with to use and build nix GUI apps so they can make their research results (which sometimes depend on GUIs) more reproducible and friendly to people creating derivative work. They rarely have the patience or knowledge to debug their development system like we're doing in this thread. They might eventually become NixOS users, but not at the beginning.


I've been thinking more about the problem in this thread in general, and I'm curious to hear more thoughts. It seems that nix, AppImage, docker, chroot, and all other things that try to statically define or contain applications suffer from the same problem: there are system libraries you link to (e.g. libGL.so) that you can't statically link to ahead of time because they depend on variable hardware.

Like others have said, in many cases the ABIs of these libraries and their dependencies are stable. But the set of possible libGL.so dependencies (e.g. libc, libz, libxcb, libstdc++) isn't well specified or stable as far as I know. So application developers seem to build with old versions of their compiler and pray. That seems somewhat contrary to nix: with nix, you're encouraged to be very specific about versions of libraries so your app or experiment is reproducible for others (a great thing). And in practice, libGL.so (and other systems libs) can make that impossible or fragile at best.

Instead, I think what I want is for libGL.so and all the other system libraries I don't know about (but need in the same way I need the Linux kernel) to be built into a separate process that uses IPC to communicate with dependent apps (shared memory for IPC, ideally). That means there are no linking conflicts between system libs and apps. And if the ABI of libGL is stable, in this model that's the only ABI I need to care about. Indirect rendering through an X server is one way to do this, but it could probably be a lot simpler than that. The only problem is that I don't know how to achieve it yet.

guibou commented 5 years ago

@edahlgren

> Cool, right. I guess I'm curious, is there still a chance that the drivers nixGL installs can somehow link poorly with applications pinned to specific versions of libstdc++, etc? I guess that'd be true if the drivers can't be compiled from source using the same standard libs and compiler as the application?

I'm not sure I understand your question. The drivers that nixGL installs link properly against all the needed dependencies (including libc, libstdc++, ...), which are taken directly from nix. That's the point of nixGL. In this case, the drivers will use the same libraries as the application.

edahlgren commented 5 years ago

@guibou Sorry if I wasn't clear. What I mean is that nixGL seems to download nvidia drivers if needed, and those are closed source IIUC. Those presumably have certain fixed dependencies. They might use versions of libc, libstdc++, etc taken from nix. But that doesn't prevent versioning conflicts with nix apps that depend on other versions of libc, libstdc++, etc taken from nix too, correct?

But maybe that's just a general problem of linking to closed source libraries, not specific to nix (though maybe more obvious with nix).

timjrd commented 5 years ago

Hi there! I packaged an OpenGL application with Nix. It's working well on NixOS, but now I would like to distribute it as an AppImage with nix-bundle. I successfully generated the image and I can run it on Ubuntu 18, but it rapidly fails with a GLX error from GLFW (same thing if I install Nix and import the closure). I tried all of the above hacks without any success. Is there any chance I can get it running outside of NixOS without root access (so I can easily distribute it)? Maybe I could bundle nixGL?

guibou commented 5 years ago

@timjrd You won't be able to bundle nixGL because, for now, nixGL must know the targeted driver at build time.

> They might use versions of libc, libstdc++, etc taken from nix. But that doesn't prevent versioning conflicts with nix apps that depend on other versions of libc, libstdc++, etc taken from nix too, correct?

That's why you MUST use nixGL with the same nixpkgs clone as the one used for your application. The nixGL derivation has a pkgs attribute, which MUST be set to that same nixpkgs clone.

timjrd commented 5 years ago

OK, I eventually managed to get my AppImage "automatically" working on Ubuntu 18 using patchelf, LD_LIBRARY_PATH, and LIBGL_DRIVERS_PATH in a hacky wrapper script. It should be extendable to other distributions, although it's not very robust. This helper script was very useful to find missing dlopened libraries.
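
For anyone wanting to try the same approach, here is a minimal sketch of what such a wrapper could look like. It is not the actual script mentioned above. The function names `host_libgl_dir` and `run_with_host_gl` are hypothetical, and the sketch assumes the host's DRI drivers live in a `dri` subdirectory next to libGL.so.1, which may not hold on every distribution. It only covers the environment variables, not the patchelf step.

```shell
#!/bin/sh
# Hypothetical helper: read `ldconfig -p`-style output on stdin and print the
# directory that holds the first 64-bit libGL.so.1 entry (nothing if none).
host_libgl_dir() {
  awk '$1 == "libGL.so.1" && /x86-64/ {
         path = $NF                   # last field is the library path
         sub("/[^/]*$", "", path)     # drop the file name, keep the directory
         print path
         exit
       }'
}

# Hypothetical wrapper: run a command with the host GL directory appended to
# the library search path. Assumption: DRI drivers sit in "$gl_dir/dri".
run_with_host_gl() {
  gl_dir=$(ldconfig -p | host_libgl_dir)
  [ -n "$gl_dir" ] || { echo "no host libGL.so.1 found" >&2; return 1; }
  LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$gl_dir" \
  LIBGL_DRIVERS_PATH="${LIBGL_DRIVERS_PATH:-$gl_dir/dri}" \
    "$@"
}
```

Usage would be something like `run_with_host_gl ./my-app.AppImage`. Appending (rather than prepending) the host directory is a deliberate choice so that libraries from the nix closure are preferred, but it can still pull in host versions of other libraries, which is exactly the fragility discussed earlier in this thread.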

nh2 commented 4 years ago

For meshlab, nixGL continues to be a workaround: https://github.com/NixOS/nixpkgs/pull/70937#issuecomment-541036241

adieu commented 4 years ago

Hit this issue in Crostini (the Linux environment in ChromeOS).

Fixed it with

sudo mkdir -p /run/opengl-driver/;sudo ln -s `nix eval --raw nixpkgs.mesa_drivers.outPath`/lib /run/opengl-driver/lib

Might be useful for other environments too.
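
The one-liner above fails on a second run because the link already exists. A re-runnable variant could look like the sketch below. The function name `link_opengl_driver` and its optional second argument are made up for illustration, and it relies on `ln -n`, which GNU coreutils and busybox provide but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper: link_opengl_driver DRIVER_LIB_DIR [TARGET_ROOT]
# Points TARGET_ROOT/lib (default: /run/opengl-driver/lib) at DRIVER_LIB_DIR.
link_opengl_driver() {
  driver_lib=$1
  target_root=${2:-/run/opengl-driver}
  [ -d "$driver_lib" ] || { echo "not a directory: $driver_lib" >&2; return 1; }
  mkdir -p "$target_root" || return 1
  # -f replaces an existing link; -n avoids descending into the old link target
  ln -sfn "$driver_lib" "$target_root/lib"
}

# Example (needs root for the default target, same nix command as above):
#   link_opengl_driver "$(nix eval --raw nixpkgs.mesa_drivers.outPath)/lib"
```

Passing a different target root is only useful for trying it out without root; applications still look in /run/opengl-driver.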

guibou commented 4 years ago

@adieu nixGL automates this process and does not need sudo.

nixos-discourse commented 4 years ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixgl-needs-you-for-testing/6921/7