graphdeco-inria / gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

Accessing SIBR Viewer through X11 on Mac or XMing on Windows #61

Closed ShehriyarShariq closed 1 year ago

ShehriyarShariq commented 1 year ago

Hi. I have the SIBR Viewer set up on an Ubuntu 22.04 machine that fully satisfies the listed hardware and software requirements. The required dependencies for the SIBR viewer are installed and the build finished successfully.

In order to run the GUI-based Remote Gaussian SIBR Viewer, I SSH'ed into my server using X11 on MacOS and XMing (with PuTTY) on Windows with the -X flag in the ssh command and ensured that the DISPLAY env variable is properly configured.

Upon running ./SIBR_remoteGaussian_app, I get the following output:

MacOS

[SIBR] --  INFOS  --:   Initialization of GLFW
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
[SIBR] ##  ERROR  ##:   FILE /home/paperspace/gaussian-splatting/SIBR_viewers/src/core/graphics/Window.cpp
            LINE 30, FUNC glfwErrorCallback
            GLX: An OpenGL profile requested but GLX_ARB_create_context_profile is unavailable
terminate called after throwing an instance of 'std::runtime_error'

Windows

[SIBR] --  INFOS  --:   Initialization of GLFW
[SIBR] ##  ERROR  ##:   FILE /home/paperspace/gaussian-splatting/SIBR_viewers/src/core/graphics/Window.cpp
            LINE 30, FUNC glfwErrorCallback
            GLX: GLX version 1.3 is required
terminate called after throwing an instance of 'std::runtime_error'

Server Specifications:

OS: Ubuntu 22.04
GPU: A100

PatSoar3D commented 1 year ago

I am also getting similar issues when trying to load up the viewer. I can confirm the original build/setup is working on my Ubuntu 22.04 machine, but as soon as I try to use any tool to access the GUI, I'm getting hit with one of these issues.

rasheedsaqib commented 1 year ago

I am also facing the same issue when trying to load up the viewer. Just to confirm, the original build/setup is working fine on my Ubuntu 22.04 machine. However, as soon as I attempt to use any tool to access the GUI, I encounter the same problem mentioned in the issue. It's affecting my workflow, and I'm eager to see a resolution for this.

Snosixtyboo commented 1 year ago

Hi,

the SIBR framework needs OpenGL 4.5+. Are the glxutils installed on the server? Are the mesa package and opengl drivers up to date? What do you get when you run

glxinfo | grep "version"

?
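For anyone who wants to script this check, here is a minimal sketch that reads glxinfo output on stdin and succeeds only if the reported OpenGL version is at least 4.5. The helper name gl_version_at_least_4_5 is made up for illustration, and it assumes the usual "OpenGL version string: <major>.<minor>..." line format.

```shell
# Hypothetical helper (not part of SIBR or Mesa): exit status 0 if the
# "OpenGL version string" line read from stdin reports version 4.5 or newer.
gl_version_at_least_4_5() {
    awk '
        /^OpenGL version string:/ {
            # Field 4 is the version number, e.g. "4.6.0" or "4.5".
            split($4, v, ".")
            found = 1
            ok = (v[1] + 0 > 4) || (v[1] + 0 == 4 && v[2] + 0 >= 5)
        }
        END {
            if (found && ok) exit 0
            exit 1
        }
    '
}

# Possible usage on the server:
#   glxinfo | gl_version_at_least_4_5 && echo "OpenGL 4.5+ reported"
```

If glxinfo itself fails, as in some of the outputs in this thread, the helper sees no version line and reports failure.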

ShehriyarShariq commented 1 year ago

Executing the above commands returns the following output:

On Ubuntu 22.04

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  149 (GLX)
  Minor opcode of failed request:  24 (X_GLXCreateNewContext)
  Value in failed request:  0x0
  Serial number of failed request:  26
  Current serial number in output stream:  27

The mesa package was also successfully installed.

The server has an Nvidia GPU, the latest of the available ones, so it should be OpenGL 4.5+ ready. However, running any OpenGL-related command still gives an error. I've also tried updating to the latest recommended Nvidia drivers, but no luck.

Snosixtyboo commented 1 year ago

Ok, being unable to run OpenGL apps would create an issue. Unfortunately, I have very little experience with running OpenGL apps on Linux machines over X servers. But I would assume that if you can find a fix for this (maybe enforcing software rendering through MESA will do the trick?), the viewer should work.

Hth, Bernhard

Snosixtyboo commented 1 year ago

Maybe just export LIBGL_ALWAYS_SOFTWARE=1 will do the trick.

ShehriyarShariq commented 1 year ago

The same output as before. At this point the issue might be due to the Nvidia drivers interfering with the Mesa configuration. I will look into that and report back if it fixes the issue, in case someone runs into the same problem in the future.

The following are the Nvidia driver specs for reference:

Nvidia Driver Specs

PatSoar3D commented 1 year ago

> Ok, being unable to run OpenGL apps would create an issue. Unfortunately, I have very little experience with running OpenGL apps on Linux machines over X servers. But I would assume that if you can find a fix for this (maybe enforcing software rendering through MESA will do the trick?), the viewer should work.
>
> Hth, Bernhard

Did you guys use a desktop image of Ubuntu 22.04 in your testing?

Snosixtyboo commented 1 year ago

Nope, we didn't have a dedicated Ubuntu machine. The only time we tested the remote viewer through X (which we believed to be a rare use case: ideally the viewer runs locally and just connects to the remote instance), we connected to a WSL Ubuntu instance from Windows.

PatSoar3D commented 1 year ago

[PROGRESS]

Solved some of the issues on my end regarding the OpenGL drivers failing to be recognized. It took a few steps.

First, the Mesa libraries seem to require 32-bit packages, which can be enabled on a 64-bit machine but aren't by default. I added support for the architecture type:

sudo dpkg --add-architecture i386
sudo apt-get update

Then I installed support for the i386 architecture:

sudo apt-get install libnvidia-gl-525:i386

Once this ran successfully I then had to identify where the default symlinks were being pointed for the libGL.so.1 library. Since it was pointed to mesa by default, we needed to remove the symlink, & relink the library to the NVIDIA GPU, which I ran with this command:

sudo ln -s libGLX_nvidia.so.525.125.06 libGL.so.1

From here I could successfully verify the OpenGL installation:

glxinfo | grep "OpenGL version"

Output: OpenGL version string: 4.6.0 NVIDIA 535.54.03
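A slightly safer way to do the relinking step above is to keep the old link around so it can be restored if things get worse. This is only a sketch: relink_with_backup is a hypothetical helper, and the directory holding libGL.so.1 depends on the system, so the usage comment leaves it unspecified.

```shell
# Hypothetical helper: make LINK point at TARGET, keeping any existing
# LINK as LINK.bak instead of deleting it.
relink_with_backup() {
    target=$1
    link=$2
    # mv renames the symlink itself; it does not follow it.
    if [ -e "$link" ] || [ -L "$link" ]; then
        mv "$link" "$link.bak" || return 1
    fi
    ln -s "$target" "$link"
}

# Possible usage (as root, from the directory that contains libGL.so.1):
#   relink_with_backup libGLX_nvidia.so.525.125.06 libGL.so.1
# To undo:
#   mv libGL.so.1.bak libGL.so.1
```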

[CONTINUED ISSUE] Although the fbconfigs & swrast problems have now been addressed & can be considered solved, I'm still finding myself met with an error when trying to launch ./SIBR_remoteGaussian_app.

Error:

[SIBR] --  INFOS  --:   Initialization of GLFW
[SIBR] ##  ERROR  ##:   FILE /home/gaussian-splatting/SIBR_viewers/src/core/graphics/Window.cpp
            LINE 30, FUNC glfwErrorCallback
            GLX: An OpenGL profile requested but GLX_ARB_create_context_profile is unavailable
terminate called after throwing an instance of 'std::runtime_error'
  what():  See log for message errors
Aborted (core dumped)

@Snosixtyboo any thoughts after the process of elimination?

Snosixtyboo commented 1 year ago

Hmm, not much comes to mind, this is territory I'm not very familiar with. What is the output of just glxinfo | grep "version"? Can you run glxgears? Have you tried again with export LIBGL_ALWAYS_SOFTWARE=1 after the updates?

PatSoar3D commented 1 year ago

Output of glxinfo is:

server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4
OpenGL version string: 4.6.0 NVIDIA 535.54.03
OpenGL shading language version string: 4.60 NVIDIA

glxgears gives an error:

Error: couldn't get an RGB, Double-buffered visual

& even after running export LIBGL_ALWAYS_SOFTWARE=1

I still get the same issue as above.

I have been able to (inconsistently) solve the OpenGL profile error by tinkering with the glfwWindowHint calls in SIBR_viewers/src/core/graphics/Window.cpp, where I set the OpenGL version to 4.6 instead of 4.5 and switched GLFW_OPENGL_CORE_PROFILE to GLFW_OPENGL_COMPAT_PROFILE.

Although this doesn't always solve the above bug, the times that I am able to get past the OpenGL profile issue, I get hit with:

## ERROR ##: FILE /home/gaussian-splatting/SIBR_viewers/src/core/graphics/Window.cpp
            LINE 30, FUNC glfwErrorCallback
            X11: Failed to open display ps2mx83fl:10.0
terminate called after throwing an instance of 'std::runtime_error'
  what():  See log for message errors
Aborted (core dumped)

PatSoar3D commented 1 year ago

I also created a small C++ program to check whether 'GLX_ARB_create_context_profile' is supported:

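#include <GL/glx.h>
#include <X11/Xlib.h>
#include <cstring>
#include <iostream>

// Build with, e.g.: g++ check_glx.cpp -o check_glx -lGL -lX11
int main() {
    // Connect to the X server named by the DISPLAY environment variable.
    Display *disp = XOpenDisplay(NULL);
    if (!disp) {
        std::cerr << "Could not open the X display." << std::endl;
        return 1;
    }
    int screen = DefaultScreen(disp);

    // Fetch the framebuffer configs and create a throwaway context from the first one.
    int nelements = 0;
    GLXFBConfig *fbc = glXGetFBConfigs(disp, screen, &nelements);
    if (!fbc || nelements == 0) {
        std::cerr << "No GLX framebuffer configs available." << std::endl;
        XCloseDisplay(disp);
        return 1;
    }
    GLXContext ctx = glXCreateNewContext(disp, fbc[0], GLX_RGBA_TYPE, 0, True);

    // The extension list is one space-separated string; it can be NULL on failure.
    const char *glxExts = glXQueryExtensionsString(disp, screen);
    bool has_ARB_create_context_profile =
        glxExts && std::strstr(glxExts, "GLX_ARB_create_context_profile") != NULL;

    if (has_ARB_create_context_profile) {
        std::cout << "Your system's GLX supports the ARB_create_context_profile extension." << std::endl;
    } else {
        std::cout << "Your system's GLX does not support the ARB_create_context_profile extension." << std::endl;
    }

    // Release everything that was acquired above.
    XFree(fbc);
    if (ctx) glXDestroyContext(disp, ctx);
    XCloseDisplay(disp);

    return 0;
}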

Although this always outputs 'Your system's GLX does not support the ARB_create_context_profile extension.', I am still able to occasionally circumvent the profile error above and hit the X11 error instead.

I highly doubt it, but is there something about the Ampere architecture that would be incompatible with the OpenGL features the SIBR viewer relies on?

Gotta love dependency rabbit holes, appreciate your time thinking through this one @Snosixtyboo

Snosixtyboo commented 1 year ago

Hi,

I'm pretty sure it's not due to the Ampere architecture. It's more likely some incompatibility of drivers/glx/mesa/x11, but at which point it occurs I can't tell. The fact that glxgears fails is a problem; that's the go-to basic example to confirm that OpenGL can work at all.

Snosixtyboo commented 1 year ago

Is there somebody maintaining this server? Could they help you get OpenGL applications to run properly over a remote connection?

PatSoar3D commented 1 year ago

hey @Snosixtyboo

I am maintaining this machine, and I was able to get OpenGL applications running remotely with some VirtualGL workarounds.

I can confirm that all OpenGL applications now work and display windows across the X server I am SSH'ing in with; glxgears now runs with the intended output.

After successfully reinstalling the repo and setting up the SIBR libraries, train and render ran as intended. However, when trying to load any of the viewer apps with sudo vglrun ./SIBR_remoteGaussian_app, I am now consistently hit with this error output:

[SIBR] --  INFOS  --:   Initialization of GLFW
X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  130 (MIT-SHM)
  Minor opcode of failed request:  3 (X_ShmPutImage)
  Serial number of failed request:  50
  Current serial number in output stream:  51

Any chance this helps on your end narrow down where the problem is stemming from?

ShehriyarShariq commented 1 year ago

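@Snosixtyboo From what I've understood, MIT-SHM does not work over a remote X11 connection such as SSH forwarding, since it relies on memory shared between the client and the X server, as explained in this link: https://unix.stackexchange.com/questions/534314/working-with-the-mit-shm-x11-extension-on-linux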
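A rough way to tell whether a session is likely affected: a DISPLAY value that goes through a host name (for example localhost:10.0 under ssh -X) means the client is not talking to a local X server socket, so shared memory is not an option. The sketch below is a heuristic only, and display_is_local is a name invented for this example.

```shell
# Hypothetical helper: succeed if the given DISPLAY value refers to a
# local X server socket (":0", "unix:0"), fail if it names a host.
display_is_local() {
    case "$1" in
        :*|unix:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage:
#   display_is_local "$DISPLAY" || echo "Remote/forwarded display: MIT-SHM will not work"
```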

Could you let me know of an alternative way to stream the images that the SIBR Viewer is meant to display, over a socket instead? The transmission will certainly be slower than what MIT-SHM provides, but at least there would be something visible to work with and improve upon.

Snosixtyboo commented 1 year ago

Hi,

the remote viewer already uses a socket, i.e., it should be capable of making connections to remote PCs. I guess I simply didn't anticipate users trying to use it through SSH instead of connecting directly via IP. So hopefully you can somewhat easily decipher the messages that are being sent to it. The message format is in gaussian_renderer/network_gui.py. It is pretty straightforward, except for the cameras. With some slight modifications, you could make it render specific camera indices during training rather than the remote viewer's free-moving camera.

Hth, Bernhard
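One way to follow this suggestion without exposing the training machine directly might be to run the viewer locally and forward the network GUI port over SSH. The sketch below only prints the ssh command rather than running it. tunnel_command is a hypothetical helper, user@server is a placeholder, and port 6009 is an assumption about which port the training run listens on, so check the port your own run actually uses.

```shell
# Hypothetical helper: print an ssh command that forwards local PORT to
# the same PORT on the remote machine (-N: no remote command, -L: local forward).
tunnel_command() {
    port=$1
    remote=$2
    printf 'ssh -N -L %s:localhost:%s %s\n' "$port" "$port" "$remote"
}

# Possible usage (6009 and user@server are assumptions, see above):
#   tunnel_command 6009 user@server
```

With such a tunnel open, a viewer built on the local machine would connect to localhost on that port, so no OpenGL window has to travel over X11 forwarding.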

Coder-ZZY commented 11 months ago

[SIBR] --  INFOS  --:   Initialization of GLFW
[SIBR] ##  ERROR  ##:   FILE /mnt/sda/gaussian-splatting/SIBR_viewers/src/core/graphics/Window.cpp
            LINE 30, FUNC glfwErrorCallback
            Linux: Failed to watch for joystick connections in /dev/input: No space left on device
terminate called after throwing an instance of 'std::runtime_error'
  what():  See log for message errors
Aborted (core dumped)

xiazhi1 commented 11 months ago

> [SIBR] --  INFOS  --:   Initialization of GLFW
> [SIBR] ##  ERROR  ##:   FILE /mnt/sda/gaussian-splatting/SIBR_viewers/src/core/graphics/Window.cpp
>             LINE 30, FUNC glfwErrorCallback
>             Linux: Failed to watch for joystick connections in /dev/input: No space left on device
> terminate called after throwing an instance of 'std::runtime_error'
>   what():  See log for message errors
> Aborted (core dumped)

Same bug for me on Ubuntu 22.04. Have you solved it? @Coder-ZZY

Coder-ZZY commented 11 months ago

> [SIBR] --  INFOS  --:   Initialization of GLFW
> [SIBR] ##  ERROR  ##:   FILE /mnt/sda/gaussian-splatting/SIBR_viewers/src/core/graphics/Window.cpp
>             LINE 30, FUNC glfwErrorCallback
>             Linux: Failed to watch for joystick connections in /dev/input: No space left on device
> terminate called after throwing an instance of 'std::runtime_error'
>   what():  See log for message errors
> Aborted (core dumped)
>
> Same Bug for me In Ubuntu22.04, have you solved it? @Coder-ZZY

@xiazhi1 I used the method from this link: https://github.com/glfw/glfw/issues/833, but I then ran into another problem. Sometimes the error is:

[SIBR] --  INFOS  --:   Initialization of GLFW
Segmentation fault (core dumped)

and sometimes the error is:

X Error of failed request:  GLXBadContextTag
  Major opcode of failed request:  146 (GLX)
  Minor opcode of failed request:  5 (X_GLXMakeCurrent)
  Serial number of failed request:  116
  Current serial number in output stream:  116

I don't know how to solve it [sad].

elenacliu commented 8 months ago

Any solutions now? 🥹🥹🥹

ERGOWHO commented 3 months ago

For anyone who gets this bug:

[SIBR] --  INFOS  --:   Initialization of GLFW
[SIBR] ##  ERROR  ##:   FILE /home/gaussiansplatting/SIBR_viewers/src/core/graphics/Window.cpp
            LINE 30, FUNC glfwErrorCallback
            Linux: Failed to watch for joystick connections in /dev/input: No space left on device
terminate called after throwing an instance of 'std::runtime_error'
  what():  See log for message errors
Aborted (core dumped)

This issue will help you solve it: https://github.com/glfw/glfw/issues/833. What you really need to do from that issue is:

sudo sysctl fs.inotify.max_user_watches=32768

My SIBR works with this!
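For context, the "No space left on device" message here refers to the per-user inotify watch limit, not to disk space. Below is a small sketch for checking the current limit before raising it. Both helper names are made up for this example, and 32768 is simply the value from the command above.

```shell
# Hypothetical helper: print the current per-user inotify watch limit.
# An alternative file can be passed as the first argument.
current_inotify_watches() {
    cat "${1:-/proc/sys/fs/inotify/max_user_watches}"
}

# Hypothetical helper: succeed if CURRENT is below WANTED.
needs_more_watches() {
    [ "$1" -lt "$2" ]
}

# Possible usage:
#   if needs_more_watches "$(current_inotify_watches)" 32768; then
#       sudo sysctl fs.inotify.max_user_watches=32768
#   fi
```

A value set with sysctl this way lasts until the next reboot; putting the same setting in a file under /etc/sysctl.d/ makes it persistent.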

DJNing commented 1 month ago

any updates on this issue now?