jupyter-xeus / xeus-octave

Jupyter kernel for GNU Octave
https://xeus-octave.readthedocs.io/
GNU General Public License v3.0

Add GLAD loader and remove osmesa #70

Closed rapgenic closed 2 years ago

rapgenic commented 2 years ago

Hi, while trying to build the project I noticed that it won't build if I'm not using osmesa, and this is probably because the conda gcc compiler cannot find the system GL libraries.

With this commit I backported the GLAD (OpenGL loader) integration from my development branch, so that the OpenGL libraries do not need to be linked at build time.
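The idea of resolving OpenGL entry points at run time rather than at link time can be illustrated loosely in Python with ctypes. This is only an analogy, not how GLAD or xeus-octave is implemented: GLAD generates C code that fetches function pointers through a platform loader, while the hypothetical resolve_symbol helper below merely opens a shared library by name and looks up one symbol, returning None when either is missing.

```python
import ctypes
import ctypes.util


def resolve_symbol(library_name, symbol_name):
    """Look up `symbol_name` in a shared library at run time.

    Returns a ctypes function object, or None if the library or the
    symbol cannot be found. Nothing is linked ahead of time.
    (Hypothetical helper, for illustration only.)
    """
    path = ctypes.util.find_library(library_name)
    if path is None:
        return None  # library is not installed or not on the search path
    try:
        library = ctypes.CDLL(path)
    except OSError:
        return None  # library was found but could not be loaded
    # ctypes raises AttributeError for unknown symbols; map that to None.
    return getattr(library, symbol_name, None)


# The program still starts on a machine without libGL; it just sees None.
gl_get_string = resolve_symbol("GL", "glGetString")
print("glGetString available:", gl_get_string is not None)
```

The point of the sketch is that a missing OpenGL library becomes a run-time condition the program can handle, instead of a build or link failure.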

Furthermore, the Osmesa build option should no longer be necessary! I had added that only for the purpose of running the headless binder instance (which does not have a graphics card and thus no (HW) GL implementation). But with the use of xvfb (great job on that by the way!) the GL virtualisation is performed by the virtual X server in a totally transparent way and osmesa can be discarded.

In fact I've been able to run this build both in a container and locally on my PC, using either my graphics card or a totally headless setup.

On a side note, I also backported a small modification to the plotstream, which is now a string instead of an integer as it was before. I believe that's cleaner (no need for hacky conversions between integer sizes).

AntoinePrv commented 2 years ago

Thanks @rapgenic, indeed the OpenGL support is still a bit flaky... I have not learned about GLAD yet, but it seems better suited to the task!

> I noticed that it won't build if I'm not using osmesa, and this is probably because the conda gcc compiler cannot find the system GL libraries.

I had similar problems, though it was CMake not finding the libraries. Weirdly when I removed find_package(OpenGL), it would still compile and link fine :shrug:

> Furthermore, the Osmesa build option should no longer be necessary! I had added that only for the purpose of running the headless binder instance (which does not have a graphics card and thus no (HW) GL implementation). But with the use of xvfb (great job on that by the way!) the GL virtualisation is performed by the virtual X server in a totally transparent way and osmesa can be discarded.

Really? I thought those were two separate things: no GPU => Osmesa and no screen => xvfb. Well, even better then!

> In fact I've been able to run this build both in a container and locally on my PC, using either my graphics card or a totally headless setup.

:tada:

AntoinePrv commented 2 years ago

If it's OK with you, I'll take the time to check that this works with conda-forge and binder (I need to make a dev release there). I can also try on macOS. The current code is not working properly, so if this does, it's an improvement!

rapgenic commented 2 years ago

> I had similar problems, though it was CMake not finding the libraries. Weirdly when I removed find_package(OpenGL), it would still compile and link fine

Strange, but it actually rings a bell, so it may have happened to me in the past.

> Really? I thought those were two separate things: no GPU => Osmesa and no screen => xvfb. Well, even better then!

Let me first say that the world of GL etc. is still often a mystery to me, and I don't fully understand how everything works, but I believe that xvfb somehow uses osmesa in the background and "automatically" tells programs to use that OpenGL implementation.

> If it's OK with you, I'll take the time to check that this works with conda-forge and binder (I need to make a dev release there).

Sure, that's better. I didn't do that because I still haven't learnt how to use conda-forge properly without making a mess...

However, I expect it to work, as I've tried it in my devcontainer (similar to the binder instance) with and without xvfb: in the first case everything works, in the second the kernel crashes with GLFW errors as soon as I try to draw a plot.

> I can also try on macOS. The current code is not working properly, so if this does, it's an improvement!

I hope it will, although I'm not sure: I read somewhere that macOS does not support OpenGL.

rapgenic commented 2 years ago

> Really? I thought those were two separate things: no GPU => Osmesa and no screen => xvfb. Well, even better then!

You might actually be right, I just discovered that the graphics card seems to be available inside the devcontainer so it would mean that I'm still not using a software GL and it wouldn't really work on binder... that's a bummer.

I'll try with a virtual machine to be sure

rapgenic commented 2 years ago

OK, I've done some digging with binder and discovered that it actually has a (virtual) graphics card! (Or at least a functioning OpenGL implementation not inside the system.) In fact the OpenGL vendor seems to be VMware, Inc. This means that the only problem for running octave in binder is the absence of an X server, which is fixed by using xvfb (which, as you correctly said, does nothing for virtualising OpenGL).

This is also consistent with the fact that the conda-built GLFW does not support osmesa, so it cannot be using it (this was the reason I built glfw in tree).

This may mean that GLAD is not necessary either, although I think it's more correct to have a loader because of all the different OpenGL API versions.
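The conclusion that only the X server is missing suggests a simple launch rule: wrap the kernel in xvfb-run only when no display is configured. The sketch below is an assumption-laden illustration, not part of xeus-octave. The helper name kernel_command is hypothetical, an empty or unset DISPLAY variable is taken to mean "no X server", and xvfb-run (with its -a option to pick a free server number) is assumed to be installed.

```python
import os


def kernel_command(argv, environ=None):
    """Return the command to launch, prefixed with xvfb-run if needed.

    Assumption: an unset or empty DISPLAY means no X server is reachable.
    (Hypothetical helper, for illustration only.)
    """
    env = os.environ if environ is None else environ
    if env.get("DISPLAY"):
        return list(argv)  # an X display is configured, run directly
    # No display: let xvfb-run start a virtual X server around the kernel.
    return ["xvfb-run", "-a", *argv]


# Headless environment such as a container without X:
print(kernel_command(["xoctave", "-f", "kernel.json"], environ={}))
# Desktop session with a display:
print(kernel_command(["xoctave", "-f", "kernel.json"], environ={"DISPLAY": ":0"}))
```

Note that this only provides the X server. Whether rendering is done in hardware or software still depends on the OpenGL implementation found at run time.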

AntoinePrv commented 2 years ago

libgl1-mesa-dev was also installed with apt-get in Binder, so maybe that's the vendor.

Another difficulty I had was that liboctinterp links with OpenGL itself (even if, as I understand it, it is not doing any OpenGL work with Xeus-Octave).

rapgenic commented 2 years ago

OK, so to get a clearer picture of what's happening, I've added a few lines that print more debug information on OpenGL.

On my devcontainer I get the following:

OpenGL vendor: Mesa/X.org
OpenGL renderer: llvmpipe (LLVM 13.0.1, 256 bits)
OpenGL version: 4.5 (Compatibility Profile) Mesa 22.0.5

Also, running lsof -p <xoctave PID> | grep GL to see the paths of the loaded libraries, I get:

xoctave 63763 mambauser  mem       REG               0,34           116768 /usr/lib/x86_64-linux-gnu/libGLX_mesa.so.0.0.0 (path dev=0,62)
xoctave 63763 mambauser  mem       REG               0,34           116765 /usr/lib/x86_64-linux-gnu/libGLX.so.0.0.0 (path dev=0,62)
xoctave 63763 mambauser  mem       REG               0,34           116770 /usr/lib/x86_64-linux-gnu/libGLdispatch.so.0.0.0 (path dev=0,62)
xoctave 63763 mambauser  mem       REG               0,34            35766 /opt/conda/lib/libGLU.so.1.3.1 (path dev=0,62)
xoctave 63763 mambauser  mem       REG               0,34           116763 /usr/lib/x86_64-linux-gnu/libGL.so.1.7.0 (path dev=0,62)

On my PC instead I get this:

OpenGL vendor: Intel
OpenGL renderer: Mesa Intel(R) Xe Graphics (TGL GT2)
OpenGL version: 4.6 (Compatibility Profile) Mesa 22.1.7

And the lsof output is:

xoctave 98361 ggirardi  mem       REG               0,34           458511 /usr/lib64/libGLX_mesa.so.0.0.0 (path dev=0,36)
xoctave 98361 ggirardi  mem       REG               0,34           453227 /usr/lib64/libGLdispatch.so.0.0.0 (path dev=0,36)
xoctave 98361 ggirardi  mem       REG               0,34           458519 /usr/lib64/libGLX.so.0.0.0 (path dev=0,36)
xoctave 98361 ggirardi  mem       REG               0,34          1771067 /home/ggirardi/micromamba/lib/libGLU.so.1.3.1 (path dev=0,40)
xoctave 98361 ggirardi  mem       REG               0,34           458517 /usr/lib64/libGL.so.1.7.0 (path dev=0,36)

It's quite interesting that both use the system libraries; however, inside the container the renderer happens to be llvmpipe, which is a software renderer! It apparently already exists in Mesa without even needing osmesa?? What a mess!
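The comparison above can be made mechanical by parsing the three debug lines and checking the renderer string. The following is a minimal sketch: the function names are hypothetical, and the list of software renderer names (llvmpipe, softpipe, swrast) is an assumption about Mesa's CPU-based renderers that is not meant to be exhaustive.

```python
import re

# Assumed, non-exhaustive list of names used by Mesa's CPU-based renderers.
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe", "swrast")


def parse_gl_info(text):
    """Parse 'OpenGL <key>: <value>' lines into a dict (vendor, renderer, version)."""
    info = {}
    for line in text.splitlines():
        match = re.match(r"\s*OpenGL (\w+):\s*(.*)$", line)
        if match:
            info[match.group(1)] = match.group(2).strip()
    return info


def is_software_renderer(info):
    """True if the renderer string names a known software renderer."""
    renderer = info.get("renderer", "").lower()
    return any(name in renderer for name in SOFTWARE_RENDERERS)


devcontainer = """OpenGL vendor: Mesa/X.org
OpenGL renderer: llvmpipe (LLVM 13.0.1, 256 bits)
OpenGL version: 4.5 (Compatibility Profile) Mesa 22.0.5"""

pc = """OpenGL vendor: Intel
OpenGL renderer: Mesa Intel(R) Xe Graphics (TGL GT2)
OpenGL version: 4.6 (Compatibility Profile) Mesa 22.1.7"""

print(is_software_renderer(parse_gl_info(devcontainer)))  # True
print(is_software_renderer(parse_gl_info(pc)))            # False
```

The vendor string alone is not a reliable indicator, so the check deliberately looks only at the renderer.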

rapgenic commented 2 years ago

I tried this on the binder instance as well and I got the following:

OpenGL vendor: VMware, Inc.
OpenGL renderer: llvmpipe (LLVM 10.0.0, 256 bits)
OpenGL version: 3.1 Mesa 20.0.8

There is no lsof available there, however.

Anyway, we can clearly see that it doesn't really have a GPU and that it is using a software renderer. I have no idea, however, why the vendor is VMware, Inc. and not Mesa/X.org...
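Where lsof is missing, the same information can usually be read on Linux from /proc/<pid>/maps, which lists every file mapped into a process. The sketch below assumes a Linux /proc filesystem. The function names are hypothetical, and the sample text is made up for illustration (the addresses and inode numbers are not real output), reusing library paths from the lsof listings above.

```python
from pathlib import Path


def gl_libraries_from_maps(maps_text, needle="GL"):
    """Return the sorted, unique mapped files whose file name contains `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if path.startswith("/") and needle in Path(path).name:
                paths.add(path)
    return sorted(paths)


def gl_libraries_of_process(pid):
    """Read /proc/<pid>/maps (Linux only) and list the mapped GL libraries."""
    return gl_libraries_from_maps(Path(f"/proc/{pid}/maps").read_text())


# Made-up sample in the /proc/<pid>/maps format.
sample = """\
7f1c2a000000-7f1c2a021000 r--p 00000000 00:22 116763 /usr/lib/x86_64-linux-gnu/libGL.so.1.7.0
7f1c2a021000-7f1c2a0a0000 r-xp 00021000 00:22 116763 /usr/lib/x86_64-linux-gnu/libGL.so.1.7.0
7f1c2b000000-7f1c2b010000 r--p 00000000 00:22 35766 /opt/conda/lib/libGLU.so.1.3.1
7f1c2c000000-7f1c2c200000 r--p 00000000 00:22 1234 /usr/lib/x86_64-linux-gnu/libc.so.6
7ffd3a000000-7ffd3a021000 rw-p 00000000 00:00 0 [stack]
"""

for path in gl_libraries_from_maps(sample):
    print(path)
```

For a running kernel, gl_libraries_of_process(pid) would be called with the xoctave PID, provided the caller has permission to read that process's maps file.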

AntoinePrv commented 2 years ago

This seems to be working properly with Binder! No success on macOS: there is some form of segfault, but I have a hard time getting to the process, which is hidden behind either jupyter or pytest...

rapgenic commented 2 years ago

That's great!

Regarding the problems with macOS, you could try debugging the kernel outside jupyter by creating a connection file, here called kernel.json:

{
    "transport": "tcp",
    "ip": "127.0.0.1",
    "control_port": 45913,
    "shell_port": 39991,
    "stdin_port": 48143,
    "iopub_port": 59023,
    "hb_port": 36923,
    "signature_scheme": "hmac-sha256",
    "key": "c1319e5b-8d6b5362d70e37f6dc4aceab"
}

Then, for example, run the kernel under gdb with gdb --args xoctave -f kernel.json (without --args, gdb would treat -f as one of its own options) and attach jupyter lab or the console as reported at kernel boot.

Otherwise you could attach the debugger by PID.
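A connection file like the one above does not have to be written by hand. The following minimal sketch generates one with the same fields, using only the Python standard library. The helper names are hypothetical, the ports are whatever the operating system reports as free at that moment (so another process could still claim one before the kernel starts), and a random UUID string is assumed to be an acceptable signing key.

```python
import json
import socket
import uuid

PORT_FIELDS = ("control_port", "shell_port", "stdin_port", "iopub_port", "hb_port")


def free_tcp_ports(count, ip="127.0.0.1"):
    """Ask the OS for `count` distinct, currently unused TCP ports on `ip`."""
    sockets = []
    try:
        # Keep all sockets open until every port is known, so they differ.
        for _ in range(count):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sockets.append(sock)
            sock.bind((ip, 0))  # port 0 lets the OS choose a free port
        return [sock.getsockname()[1] for sock in sockets]
    finally:
        for sock in sockets:
            sock.close()


def make_connection_info(ip="127.0.0.1"):
    """Build a dict with the same fields as the connection file shown above."""
    info = {"transport": "tcp", "ip": ip}
    info.update(zip(PORT_FIELDS, free_tcp_ports(len(PORT_FIELDS), ip)))
    info["signature_scheme"] = "hmac-sha256"
    info["key"] = str(uuid.uuid4())  # assumed: any random string works as key
    return info


def write_connection_file(path, ip="127.0.0.1"):
    """Write the connection info as JSON to `path` and return it."""
    info = make_connection_info(ip)
    with open(path, "w") as handle:
        json.dump(info, handle, indent=4)
    return info


print(json.dumps(make_connection_info(), indent=4))
```

After write_connection_file("kernel.json"), the kernel can be started under the debugger with that file, and a client can then be pointed at the same file, for instance with jupyter console --existing kernel.json.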