openai / roboschool

DEPRECATED: Open-source software for robot simulation, integrated with OpenAI Gym.

Quickfix for OpenGL initialization #15

Open benelot opened 7 years ago

benelot commented 7 years ago

I followed the installation instructions and everything went well so far. However, now I tried to run the pretrained RoboschoolAnt from the agent_zoo, but it fails to run. The following is the output:

python3 ./RoboschoolAnt_v0_2017apr.py
2017-05-31 21:17:53.304307: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-05-31 21:17:53.304637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: Quadro K2100M
major: 3 minor: 0 memoryClockRate (GHz) 0.6665
pciBusID 0000:01:00.0
Total memory: 1.95GiB
Free memory: 1.52GiB
2017-05-31 21:17:53.304659: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0 
2017-05-31 21:17:53.304664: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y 
2017-05-31 21:17:53.304685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Quadro K2100M, pci bus id: 0000:01:00.0)
[2017-05-31 21:17:53,304] Making new env: RoboschoolAnt-v0
QGLShaderProgram: could not create shader program
bool QGLShaderPrivate::create(): Could not create shader of type 2.
python3: render-simple.cpp:217: void SimpleRender::Context::initGL(): Assertion `r0' failed.
Aborted

After some research, it seems that OpenGL is not properly loaded. I could fix the problem by adding the following line to the imports:

from OpenGL import GLU

After adding this, the script ran properly. Maybe this import should be included somewhere by default. I am looking into this a bit more. Tell me if you need any other information.

benelot commented 7 years ago

There is a pull request, #14, that seems to solve a similar issue.

olegklimov commented 7 years ago

Importing OpenGL doesn't look like the correct solution to me. As a quick fix, yes. But the environments without RGB observations can run without an X server.

Did you try to find why this happens?

jdarpinian commented 7 years ago

I also have this issue. The quick fix also works for me. Perhaps we are seeing different behavior because you may be using the Nvidia driver installer from nvidia.com; instead I am using the Nvidia binary driver packaged by Ubuntu (version 381.22 which is the most recent released by Nvidia). One thing I have noticed is that I do not have a file /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0 (as tested for by the Makefile).

Another thing I noticed is that you are doing glcx = QOpenGLContext::globalShareContext(); and later glcx->makeCurrent(...) which is explicitly forbidden by the documentation for globalShareContext: https://doc.qt.io/qt-5/qopenglcontext.html#globalShareContext . I don't know if that's a problem but it looks suspicious.

tdavchev commented 7 years ago

It seems this is an existing issue with the Nvidia drivers for Linux distributions - https://bugs.launchpad.net/ubuntu/+source/python-qt4/+bug/941826.

In summary, the problem arises when Python dynamically loads the required OpenGL libraries: it loads the Mesa GL library rather than the one from the Nvidia driver unless PyOpenGL is loaded first.

A workaround is to add the line from OpenGL import GL to a zoo file.
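
To see what a dynamic lookup for the GL libraries reports on a given machine, something like the following sketch may help. It only shows what Python's own ctypes lookup returns; whether Qt or roboschool ends up loading the same file is an assumption that would need checking separately. The helper name describe_gl_lookup is made up for this example.

```python
import ctypes.util


def describe_gl_lookup(names=("GL", "GLU")):
    """Map each short library name to what ctypes.util.find_library reports.

    The value is a library name such as 'libGL.so.1', or None when the
    lookup finds nothing (for example on a machine without OpenGL).
    """
    return {name: ctypes.util.find_library(name) for name in names}


lookup = describe_gl_lookup()
for name, found in lookup.items():
    print(name, "->", found)
```

Running this with and without the Nvidia driver installed would show whether the lookup result changes at all.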

olegklimov commented 7 years ago

On my system, I have:

$ dpkg -S /usr/lib/x86_64-linux-gnu/libGL.so
libgl1-mesa-dev:amd64: /usr/lib/x86_64-linux-gnu/libGL.so

so it is indeed from Mesa, but it is only a link:

$ ls -l /usr/lib/x86_64-linux-gnu/libGL.so
/usr/lib/x86_64-linux-gnu/libGL.so -> libGL.so.1

and libGL.so.1 is from the NVidia binary drivers package. And it works.

Can someone confirm whether this bug is still present with the binary drivers from NVidia (not from the Ubuntu repo)?
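
The symlink check above can also be scripted, which makes it easier to compare machines. This is a rough sketch: symlink_chain is a hypothetical helper, and the demo builds a throwaway directory that only mimics the libGL.so -> libGL.so.1 layout rather than touching /usr/lib.

```python
import os
import tempfile


def symlink_chain(path, max_hops=32):
    """Follow symlinks one hop at a time and return every path visited."""
    chain = [path]
    while os.path.islink(path) and len(chain) <= max_hops:
        target = os.readlink(path)
        # A relative target is resolved against the directory holding the link.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
        chain.append(path)
    return chain


# Demo on a temporary directory (made-up layout, not a real driver install).
demo_dir = tempfile.mkdtemp()
real = os.path.join(demo_dir, "libGL.so.1")
open(real, "w").close()
link = os.path.join(demo_dir, "libGL.so")
os.symlink("libGL.so.1", link)

chain = symlink_chain(link)
print(" -> ".join(os.path.basename(p) for p in chain))
```

On a real system the call would be symlink_chain("/usr/lib/x86_64-linux-gnu/libGL.so"), and the last entry shows which file the link finally resolves to.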

tdavchev commented 7 years ago

In my case $ ls -l /usr/lib/x86_64-linux-gnu/libGL.so points to /usr/lib/x86_64-linux-gnu/libGL.so -> mesa/libGL.so. I changed the symlink to point to /usr/lib/x86_64-linux-gnu/libGL.so -> ../nvidia-381/libGL.so.1 but I still get the same error.

^^^ Fix: Reinstall PyOpenGL with pip3 install PyOpenGL PyOpenGL_accelerate, then update the symlink so that it points to Nvidia's libGL.so, like so:

$ sudo rm /usr/lib/x86_64-linux-gnu/libGL.so
$ sudo ln -s /usr/lib/x86_64-linux-gnu/libGL.so.1 /usr/lib/x86_64-linux-gnu/libGL.so

olegklimov commented 7 years ago

Wow, I don't even have a /usr/lib/nvidia-XXX folder.

It seems you're using the binary driver packaged by Ubuntu.

We definitely need some kind of workaround for this.

@yadrimz can you experiment a bit more? You can check what .so it loads in /proc/PID/maps, then try to rename/relink it. Or maybe some other way.
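
A possible way to do the /proc/PID/maps check from Python is sketched below. It is Linux only, the helper names are invented for this example, and the sample text is made up to show the format; real output will differ.

```python
import os


def gl_libraries_in_maps(maps_text):
    """Return sorted, de-duplicated mapped file paths whose name contains 'libGL'."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) == 6 and "libGL" in os.path.basename(fields[5]):
            paths.add(fields[5].strip())
    return sorted(paths)


def gl_libraries_for_pid(pid="self"):
    """Read /proc/<pid>/maps and list the libGL files mapped into that process."""
    with open("/proc/{}/maps".format(pid)) as f:
        return gl_libraries_in_maps(f.read())


# Made-up sample in the /proc/PID/maps format.
sample = (
    "7f0a1c000000-7f0a1c0a0000 r-xp 00000000 08:01 123 /usr/lib/nvidia-381/libGL.so.381.22\n"
    "7f0a1c2a0000-7f0a1c2a3000 rw-p 000a0000 08:01 123 /usr/lib/nvidia-381/libGL.so.381.22\n"
    "7f0a1d000000-7f0a1d1c0000 r-xp 00000000 08:01 456 /lib/x86_64-linux-gnu/libc-2.23.so\n"
    "7ffd5a000000-7ffd5a021000 rw-p 00000000 00:00 0 [stack]\n"
)
found = gl_libraries_in_maps(sample)
print(found)
```

Calling gl_libraries_for_pid() from inside the failing script, right after the environment is created, would list the GL libraries that process actually mapped.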

Another possible workaround is to put from OpenGL import GLU into all render handlers. There are not many of them, so this is better than inserting it into every executable script, for two reasons: 1) less clutter, 2) OpenGL isn't required unless render() is called. Another way is to explicitly link against a specific library in the Makefile, which is probably the best way to do it.
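
The render-handler idea could look roughly like this. It is only a sketch of the pattern: HypotheticalEnv and ensure_opengl_loaded are stand-ins invented here, not roboschool's actual classes or functions, and the real render code is replaced by a placeholder.

```python
import importlib

_gl_preloaded = None  # None means the import has not been attempted yet


def ensure_opengl_loaded():
    """Import OpenGL.GLU once, on first use, and remember the outcome."""
    global _gl_preloaded
    if _gl_preloaded is None:
        try:
            importlib.import_module("OpenGL.GLU")
            _gl_preloaded = True
        except Exception:
            # PyOpenGL missing or unable to load its GL library.
            _gl_preloaded = False
    return _gl_preloaded


class HypotheticalEnv:
    """Stand-in for an environment class; not roboschool's real code."""

    def step(self, action):
        # Stepping never touches OpenGL, so headless runs need no PyOpenGL.
        return action

    def render(self, mode="human"):
        ensure_opengl_loaded()  # attempted only when rendering is requested
        return "rendered:" + mode  # placeholder for the real render call


env = HypotheticalEnv()
env.step(0)
attempted_before_render = _gl_preloaded is not None
env.render()
attempted_after_render = _gl_preloaded is not None
```

With this layout a script that never calls render() never triggers the import, which matches reason 2 above.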

tdavchev commented 7 years ago

@olegklimov Reinstalling PyOpenGL created a libGL.so.1 within the x86_64-linux-gnu folder; after that, modifying the symlink for libGL.so to point to libGL.so.1 worked.

Nevertheless, /proc/PID/maps was pointing in the right direction both with and without the zoo import. I didn't get around to trying the rest of the solutions, given that the reinstall worked. Regardless, thanks for the keen help!

olegklimov commented 7 years ago

Wow. That's even more surprising. PyOpenGL is supposed to be a wrapper around C library, not a C library itself! If it really installs or modifies libGL.so.1 that would be interesting.

And we still have no solution to put into README :(

olegklimov commented 7 years ago

I also didn't find it in Ubuntu. What package manager do you use?

https://packages.ubuntu.com/search?keywords=PyOpenGL&searchon=names&suite=all&section=all

And if you use pip3 install, it works from a user account, so there's no way it modified /usr/lib/something. You need root for that, don't you?

tdavchev commented 7 years ago

I briefly peeked through the source code of PyOpenGL and it seems that it copies usr/lib/-path-to-Nvidia-lib-/libGL.so.1 to usr/lib/-architecture-specific-path-/libGL.so.1 forcing all references to go through the platform module (i.e. get_sharedlib() in openglgenerator.py). Although I still don't understand why pointing the symlink to the original library file won't work.

Moreover, these two references, http://web.eecs.umich.edu/~sugih/courses/eecs487/glut-howto/ and http://glew.sourceforge.net/install.html, suggest that pyopengl does exactly that too.

In terms of package manager I use the standard apt-get that comes with Ubuntu 16.04. Nevertheless, you are absolutely right, you need root rights to be able to do that else you're doomed :(

ShuoYangRobotics commented 7 years ago

on my system:

$ dpkg -S /usr/lib/x86_64-linux-gnu/libGL.so
libgl1-mesa-dev:amd64: /usr/lib/x86_64-linux-gnu/libGL.so

and

$ ls -l /usr/lib/x86_64-linux-gnu/libGL.so
lrwxrwxrwx 1 root root 13 Jan  26 08:17 /usr/lib/x86_64-linux-gnu/libGL.so -> mesa/libGL.so

I encountered this problem, then installed pyopengl with sudo pip3 install pyopengl

Problem solved.

My system is Ubuntu 16.04, and the Nvidia driver was installed together with the CUDA driver. I guess using root rights is important along the way.

liuminggao commented 6 years ago

This method solved my problem @benelot

dmitrinesterenko commented 6 years ago

Ubuntu 16.04, Python 3. I had issues at the cmake step with an OPENGL NOT FOUND error after installing the libraries with the outlined apt-get install steps and running pip3 install pyopengl.

What resolved it was finding the libgl.so that actually links to the appropriate libgl on my system with sudo find / -name libgl.so, and then adding the folder path to my PATH. After doing this, the command was able to find OPENGL and all was well.

cmake -DBUILD_SHARED_LIBS=ON -DUSE_DOUBLE_PRECISION=1 -DCMAKE_INSTALL_PREFIX:PATH=$ROBOSCHOOL_PATH/roboschool/cpp-household/bullet_local_install -DBUILD_CPU_DEMOS=OFF -DBUILD_BULLET2_DEMOS=OFF -DBUILD_EXTRAS=OFF  -DBUILD_UNIT_TESTS=OFF -DBUILD_CLSOCKET=OFF -DBUILD_ENET=OFF -DBUILD_OPENGL3_DEMOS=OFF ..
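
The search step described above can also be done without root from Python. This is a sketch under assumptions: find_gl_libraries is a made-up helper, the pattern libGL.so* follows the capitalization seen in the paths earlier in this thread (the match is case-sensitive), and the demo tree is fabricated for illustration.

```python
import fnmatch
import os
import tempfile


def find_gl_libraries(roots, pattern="libGL.so*"):
    """Walk each root directory and return sorted paths whose file name matches."""
    matches = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                if fnmatch.fnmatchcase(filename, pattern):
                    matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


# Demo on a made-up tree; on a real system the roots might be ["/usr/lib"].
demo_root = tempfile.mkdtemp()
for subdir, name in [("mesa", "libGL.so"), ("nvidia-381", "libGL.so.1"), ("other", "libfoo.so")]:
    os.makedirs(os.path.join(demo_root, subdir))
    open(os.path.join(demo_root, subdir, name), "w").close()

found = find_gl_libraries([demo_root])
for path in found:
    print(os.path.relpath(path, demo_root))
```

Each directory in the result is a candidate for the folder path mentioned above; which one is correct depends on the installed driver.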