numtide / nix-gl-host

Run OpenGL/Cuda programs built with Nix, on all Linux distributions.
Apache License 2.0

What's the status of the project? #7

Open Fuuzetsu opened 1 year ago

Fuuzetsu commented 1 year ago

We're currently using nixGL but it's inconvenient for all the usual reasons, such as having to reinstall the closure when the host driver version changes.

However we have some Intel users (currently going via nixGL -> mesa) so nix-gl-host isn't usable as-is.

The README says:

⚠️ WARNING: we won't accept new driver contributions at this time.

The code needs to be cleaned up and rewritten before scaling to more drivers.

However there doesn't really seem to be any activity about this. Fine, it's FOSS and not like I'm paying.

I'm wondering if I should try to fork and support what I need (and not contribute back as it won't be accepted...) or try to perform whatever cleanup was deemed necessary and then extend on it or what.

picnoir commented 1 year ago

Hey!

> The README says:

Some context about that: this project originally started to solve a particular issue a Numtide customer was facing. The work has been mostly exploratory and prototype-based. If you look at the git history, you'll see that I changed the DSO injection approach several times before arriving at the current (semi-satisfactory) solution.

What's inside this repository is pretty much a prototype. I think it's a somewhat decent approach to the problem. It does the job for Nvidia-based GPUs, but from a purely technical standpoint it's pretty dirty. I originally added this warning sign because I wanted to rewrite this project in a runtime-less language and come up with a proper software architecture.

Most of the wrapper overhead is spent starting up and tearing down CPython. Off the top of my head (don't quote me on that, it's been almost a year by now), the wrapper overhead was about 100ms on a hot cache. It's not a lot in the grand scheme of things, but it's a lot if you end up running the same wrapper several times to perform various checks when entering a shell.
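To get a rough feel for the interpreter part of that overhead on a given machine, something like the following could be used. This is only a sketch: the function name `interpreter_startup_seconds` is made up for illustration, and it measures a bare interpreter that runs nothing, so it gives a lower bound rather than the wrapper's real cost.

```python
import subprocess
import sys
import time


def interpreter_startup_seconds(runs: int = 5) -> float:
    """Return the best wall-clock time to start and exit a bare interpreter.

    Hypothetical helper, not part of nix-gl-host.
    """
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" executes nothing, so the elapsed time is almost
        # entirely interpreter start-up and tear-down.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print(f"best of 5 runs: {interpreter_startup_seconds() * 1000:.1f} ms")
```

Taking the best of several runs approximates the hot-cache case, since the first run pays for loading the interpreter from disk.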

So yeah, overall, we wanted to release this tool under a FOSS license, but we did not feel like it was good-enough quality-wise to be upstreamed to nix-community. Hence the warning sign. It's more of a "this is not the best we can do quality-wise" rather than a "this is not really open source".

> However there doesn't really seem to be any activity about this

This is a nice segue to another point: I do not use this program at the moment. All the infra I currently manage (be it personal infra or associative stuff) is either running Guix or NixOS. This is not solving a problem I personally face. Hence, I have very little personal urge to push this forward.

The contract for which I originally wrote this software has been over for a while. This is the major limitation of contracting-based FOSS: once the contract is over, it's hard to maintain the software you wrote in its context.

All that to say that I don't see myself pushing this forward in the near future. Of course, this is unless somebody is willing to put money on the table and commission me (through Numtide) to push this to a production-ready state. hello@numtide.com *wink wink*.

> I'm wondering if I should try to fork and support what I need (and not contribute back as it won't be accepted...) or try to perform whatever cleanup was deemed necessary and then extend on it or what.

Please do! And if you reach a state you're satisfied with, I think it'd make sense to move this to nix-community. All I'm asking is some sort of credit.

I likely won't be willing to review everything, but I can definitely have a chat with you and help point you in the right direction. I worked on this for a while and gathered quite a lot of context around GLX/EGL.

I tried to document the current approach as best I could in the internals.md document. As stated in its last section, I think the endgame approach would be some sort of DSO sandboxing through dlmopen (or libcapsule).

Before going into an implementation craze, I'd first prototype the Mesa support using the current codebase. Injecting host DSOs into the Nix closure is kind of crazy. It surprisingly works for Nvidia because all the shared dependencies between the GPU driver and the Nix closure [^1] have a pretty stable ABI. That might not be the case for Mesa. Better figure that out sooner rather than later :)
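For readers unfamiliar with what "injecting" means here, the general idea can be sketched as putting a directory of host libraries in front of the dynamic loader's search path. This is not the project's actual code (see internals.md for that); `env_with_host_dsos` and the directory path are hypothetical, and the sketch only assumes the standard `LD_LIBRARY_PATH` behaviour of the glibc loader.

```python
import os
from typing import Dict, Mapping, Optional


def env_with_host_dsos(
    dso_dir: str, base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with dso_dir searched first.

    Hypothetical helper: dso_dir would hold the host's driver DSOs
    (or symlinks to them) that the Nix-built program should pick up.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Prepend so the host DSOs win over any previously configured paths.
    env["LD_LIBRARY_PATH"] = dso_dir + (":" + existing if existing else "")
    return env


if __name__ == "__main__":
    # Example only: the path below is made up.
    print(env_with_host_dsos("/tmp/host-dsos")["LD_LIBRARY_PATH"])
```

The resulting mapping could be passed as `env=` to `subprocess.run`. Everything the injected DSOs depend on then has to be ABI-compatible with what the Nix closure already loads, which is exactly the risk described above.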

Looking at my laptop's (Intel integrated graphics) Mesa dependencies for GLX and EGL, the list of "host" DSOs required is definitely longer than what the Nvidia proprietary driver requires.

~ » ldd /run/opengl-driver/lib/libGLX_mesa.so                                                                               ninjatrappeur@framework
        linux-vdso.so.1 (0x00007ffde271d000)
        libglapi.so.0 => /nix/store/ryxylchjvszqhjx97pzsbp8lkyd717ac-mesa-23.1.5/lib/libglapi.so.0 (0x00007fdfc859e000)
        libdrm.so.2 => /nix/store/nd1w61rrivwhb666fzz1mdp178dzbm57-libdrm-2.4.115/lib/libdrm.so.2 (0x00007fdfc8587000)
        libX11.so.6 => /nix/store/pj3w2hz3jii57q481z4drv2vybpsaxap-libX11-1.8.6/lib/libX11.so.6 (0x00007fdfc8442000)
        libxcb-glx.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-glx.so.0 (0x00007fdfc8422000)
        libxcb.so.1 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb.so.1 (0x00007fdfc83f7000)
        libX11-xcb.so.1 => /nix/store/pj3w2hz3jii57q481z4drv2vybpsaxap-libX11-1.8.6/lib/libX11-xcb.so.1 (0x00007fdfc83f2000)
        libxcb-dri2.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-dri2.so.0 (0x00007fdfc83eb000)
        libXext.so.6 => /nix/store/1wav5f555lnhksrnmzl07ikf7jayd3g0-libXext-1.3.5/lib/libXext.so.6 (0x00007fdfc83d6000)
        libXfixes.so.3 => /nix/store/11br2anikdav0q4qpkw2fy6z71gr5crm-libXfixes-6.0.1/lib/libXfixes.so.3 (0x00007fdfc83cc000)
        libXxf86vm.so.1 => /nix/store/rfs2248hhkvfp1kilf5nj0sxrvq34rc4-libXxf86vm-1.1.5/lib/libXxf86vm.so.1 (0x00007fdfc83c4000)
        libxcb-shm.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-shm.so.0 (0x00007fdfc83bf000)
        libexpat.so.1 => /nix/store/ms04w36fj5v565h4sr1giig9ilkhmx4z-expat-2.5.0/lib/libexpat.so.1 (0x00007fdfc8394000)
        libxshmfence.so.1 => /nix/store/h26hqp38qv41rkb640hg3zn2hr06kw4w-libxshmfence-1.3.2/lib/libxshmfence.so.1 (0x00007fdfc838f000)
        libxcb-randr.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-randr.so.0 (0x00007fdfc837b000)
        libxcb-dri3.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-dri3.so.0 (0x00007fdfc8375000)
        libxcb-present.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-present.so.0 (0x00007fdfc8370000)
        libxcb-sync.so.1 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-sync.so.1 (0x00007fdfc8368000)
        libxcb-xfixes.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-xfixes.so.0 (0x00007fdfc835e000)
        libm.so.6 => /nix/store/9la894yvmmksqlapd4v16wvxpaw3rg70-glibc-2.37-8/lib/libm.so.6 (0x00007fdfc827c000)
        libc.so.6 => /nix/store/9la894yvmmksqlapd4v16wvxpaw3rg70-glibc-2.37-8/lib/libc.so.6 (0x00007fdfc8096000)
        libpthread.so.0 => /nix/store/9la894yvmmksqlapd4v16wvxpaw3rg70-glibc-2.37-8/lib/libpthread.so.0 (0x00007fdfc8091000)
        libXau.so.6 => /nix/store/d2jlkszkl2jwiqvkp9647arfd9r0ma1h-libXau-1.0.11/lib/libXau.so.6 (0x00007fdfc808c000)
        libXdmcp.so.6 => /nix/store/fk3m0m22alz25vr7cmpdfzwldqg1pf8l-libXdmcp-1.1.4/lib/libXdmcp.so.6 (0x00007fdfc8084000)
        /nix/store/9la894yvmmksqlapd4v16wvxpaw3rg70-glibc-2.37-8/lib64/ld-linux-x86-64.so.2 (0x00007fdfc8657000)
----------------------------------------------------------------------------------------------------------------------------------------------------
~ » ldd /run/opengl-driver/lib/libEGL_mesa.so                                                                               ninjatrappeur@framework
        linux-vdso.so.1 (0x00007fff3fb01000)
        libgbm.so.1 => /nix/store/ryxylchjvszqhjx97pzsbp8lkyd717ac-mesa-23.1.5/lib/libgbm.so.1 (0x00007fb537be2000)
        libglapi.so.0 => /nix/store/ryxylchjvszqhjx97pzsbp8lkyd717ac-mesa-23.1.5/lib/libglapi.so.0 (0x00007fb537ba8000)
        libexpat.so.1 => /nix/store/ms04w36fj5v565h4sr1giig9ilkhmx4z-expat-2.5.0/lib/libexpat.so.1 (0x00007fb537b7d000)
        libX11-xcb.so.1 => /nix/store/pj3w2hz3jii57q481z4drv2vybpsaxap-libX11-1.8.6/lib/libX11-xcb.so.1 (0x00007fb537b76000)
        libxcb.so.1 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb.so.1 (0x00007fb537b4b000)
        libxcb-dri2.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-dri2.so.0 (0x00007fb537b44000)
        libxcb-randr.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-randr.so.0 (0x00007fb537b32000)
        libxcb-xfixes.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-xfixes.so.0 (0x00007fb537b28000)
        libdrm.so.2 => /nix/store/nd1w61rrivwhb666fzz1mdp178dzbm57-libdrm-2.4.115/lib/libdrm.so.2 (0x00007fb537b11000)
        libwayland-client.so.0 => /nix/store/5krz3xqcjgxgr7x8pd1z5xms9dkc0wjh-wayland-1.22.0/lib/libwayland-client.so.0 (0x00007fb537afd000)
        libwayland-server.so.0 => /nix/store/5krz3xqcjgxgr7x8pd1z5xms9dkc0wjh-wayland-1.22.0/lib/libwayland-server.so.0 (0x00007fb537ae7000)
        libxcb-dri3.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-dri3.so.0 (0x00007fb537ae1000)
        libxcb-present.so.0 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-present.so.0 (0x00007fb537adc000)
        libxcb-sync.so.1 => /nix/store/gvmrpcdrm4mwpy6yyxsx122ynmjb1avn-libxcb-1.15/lib/libxcb-sync.so.1 (0x00007fb537ad4000)
        libxshmfence.so.1 => /nix/store/h26hqp38qv41rkb640hg3zn2hr06kw4w-libxshmfence-1.3.2/lib/libxshmfence.so.1 (0x00007fb537acd000)
        libm.so.6 => /nix/store/9la894yvmmksqlapd4v16wvxpaw3rg70-glibc-2.37-8/lib/libm.so.6 (0x00007fb5379ed000)
        libgcc_s.so.1 => /nix/store/ci51zm09w9skb92zkc5x9x2vr1pkb0h6-gcc-12.3.0-lib/lib/libgcc_s.so.1 (0x00007fb5379cc000)
        libc.so.6 => /nix/store/9la894yvmmksqlapd4v16wvxpaw3rg70-glibc-2.37-8/lib/libc.so.6 (0x00007fb5377e6000)
        libXau.so.6 => /nix/store/d2jlkszkl2jwiqvkp9647arfd9r0ma1h-libXau-1.0.11/lib/libXau.so.6 (0x00007fb5377e1000)
        libXdmcp.so.6 => /nix/store/fk3m0m22alz25vr7cmpdfzwldqg1pf8l-libXdmcp-1.1.4/lib/libXdmcp.so.6 (0x00007fb5377d7000)
        libffi.so.8 => /nix/store/86wfcbx4zg1dmgrfs2d54flvsdadgb0p-libffi-3.4.4/lib/libffi.so.8 (0x00007fb5377ca000)
        /nix/store/9la894yvmmksqlapd4v16wvxpaw3rg70-glibc-2.37-8/lib64/ld-linux-x86-64.so.2 (0x00007fb537c41000)

And I did not even look at the transitive dependencies here. Yeah, overall, you'd better prototype this first!

Anyways, if you want to have a chat about that or just need a rubber duck, don't hesitate to contact me on Matrix or IRC ;)
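When prototyping, listings like the two above are easier to compare once parsed. A minimal sketch, assuming the usual `name => path (address)` layout of `ldd` output; `parse_ldd` is a hypothetical helper, and the sample lines are a small excerpt of the output above.

```python
import re
from typing import Dict

# Matches "libfoo.so.1 => /path/to/libfoo.so.1 (0x...)" lines.
# Lines without "=>" (linux-vdso, the loader itself) are skipped.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-f]+\)\s*$")


def parse_ldd(output: str) -> Dict[str, str]:
    """Map each resolved soname in ldd output to the path it resolved to."""
    deps = {}
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            deps[match.group(1)] = match.group(2)
    return deps


GLX_SAMPLE = """
        linux-vdso.so.1 (0x00007ffde271d000)
        libglapi.so.0 => /nix/store/ryxylchjvszqhjx97pzsbp8lkyd717ac-mesa-23.1.5/lib/libglapi.so.0 (0x00007fdfc859e000)
        libX11.so.6 => /nix/store/pj3w2hz3jii57q481z4drv2vybpsaxap-libX11-1.8.6/lib/libX11.so.6 (0x00007fdfc8442000)
"""

EGL_SAMPLE = """
        libgbm.so.1 => /nix/store/ryxylchjvszqhjx97pzsbp8lkyd717ac-mesa-23.1.5/lib/libgbm.so.1 (0x00007fb537be2000)
        libglapi.so.0 => /nix/store/ryxylchjvszqhjx97pzsbp8lkyd717ac-mesa-23.1.5/lib/libglapi.so.0 (0x00007fb537ba8000)
        libwayland-client.so.0 => /nix/store/5krz3xqcjgxgr7x8pd1z5xms9dkc0wjh-wayland-1.22.0/lib/libwayland-client.so.0 (0x00007fb537afd000)
"""

if __name__ == "__main__":
    glx, egl = parse_ldd(GLX_SAMPLE), parse_ldd(EGL_SAMPLE)
    print("needed by both:", sorted(set(glx) & set(egl)))
    print("EGL only:", sorted(set(egl) - set(glx)))
```

Running `parse_ldd` recursively on each resolved path would also surface the transitive dependencies, which is where ABI mismatches with the Nix closure are most likely to hide.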

Good luck!

[^1]: I'm talking about those ones: https://github.com/numtide/nix-gl-host/blob/main/src/nixglhost.py#L209

Fuuzetsu commented 1 year ago

Thanks for the detailed answer. I'll be sure to update when (if!) I end up doing something from here. I'll of course give credit where credit is due if anything ends up happening...