stepjam / RLBench

A large-scale benchmark and learning environment.
https://sites.google.com/corp/view/rlbench

Arise Error signal 11 when running task.reset() #146

Closed KzZheng closed 3 years ago

KzZheng commented 3 years ago

Hi, I'm using Ubuntu 20.04, CoppeliaSim 4.1, and Python 3.7.8 in an Anaconda environment. When I try to run examples/few_shot_rl.py headless, I get this error:

Error: signal 11:

/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libcoppeliaSim.so.1(_Z11_segHandleri+0x30)[0x7f7ec6beeae0]
/lib/x86_64-linux-gnu/libc.so.6(+0x46210)[0x7f7f85fe8210]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5Gui.so.5(_ZNK14QOpenGLContext10shareGroupEv+0x0)[0x7f7ec5458060]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5Gui.so.5(_ZN16QOpenGLFunctions25initializeOpenGLFunctionsEv+0x4b)[0x7f7ec5724a4b]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libQt5Gui.so.5(_ZN24QOpenGLFramebufferObjectC1EiiNS_10AttachmentEjj+0xc8)[0x7f7ec5728a18]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libsimExtOpenGL3Renderer.so(_ZN18CFrameBufferObjectC2Eii+0x5a)[0x7f7eb0cfe24a]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libsimExtOpenGL3Renderer.so(_ZN16COpenglOffscreenC1EiiiP14QOpenGLContext+0x72)[0x7f7eb0cfe602]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libsimExtOpenGL3Renderer.so(_Z21executeRenderCommandsbiPv+0x2550)[0x7f7eb0cfcb90]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libcoppeliaSim.so.1(_ZN16CPluginContainer11extRendererEiPv+0x19)[0x7f7ec6db8249]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libcoppeliaSim.so.1(_ZN13CVisionSensor24_extRenderer_prepareViewEi+0x347)[0x7f7ec6abf107]
QMutex: destroying locked mutex

I also tried using VirtualGL, but I got a different error:

No protocol specified
qt.qpa.xcb: could not connect to display :0.0
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, webgl, xcb.

Here is my nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A6000    Off  | 00000000:01:00.0 Off |                  Off |
| 30%   26C    P8    12W / 300W |     27MiB / 48685MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA RTX A6000    Off  | 00000000:41:00.0 Off |                  Off |
| 30%   25C    P8     6W / 300W |     27MiB / 48685MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA RTX A6000    Off  | 00000000:81:00.0 Off |                  Off |
| 30%   26C    P8     8W / 300W |     27MiB / 48685MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA RTX A6000    Off  | 00000000:C1:00.0 Off |                  Off |
| 30%   25C    P8    16W / 300W |     34MiB / 48685MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3327      G   /usr/lib/xorg/Xorg                 23MiB |
|    1   N/A  N/A      3327      G   /usr/lib/xorg/Xorg                 23MiB |
|    2   N/A  N/A      3327      G   /usr/lib/xorg/Xorg                 23MiB |
|    3   N/A  N/A      3327      G   /usr/lib/xorg/Xorg                 25MiB |
|    3   N/A  N/A      3902      G   /usr/bin/gnome-shell                4MiB |
+-----------------------------------------------------------------------------+

How can I fix this? It looks like the same issue as #139. Could you have a look, @stepjam?
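The two logs above can be told apart by a few recognizable substrings. The sketch below is only a triage aid: the mapping from substring to likely cause is an interpretation of those messages, not an official CoppeliaSim or Qt diagnostic, and SIGNATURES and triage are hypothetical names.

```python
from typing import List, Tuple

# (substring, hedged reading). These pairings are assumptions made for
# illustration; they are not taken from CoppeliaSim or Qt documentation.
SIGNATURES: List[Tuple[str, str]] = [
    ("No protocol specified",
     "the X server refused the connection (often an X authorization issue)"),
    ("could not connect to display",
     "Qt's xcb plugin could not reach the X display it was pointed at"),
    ("QOpenGLContext",
     "the crash happened while Qt was setting up OpenGL rendering"),
]


def triage(log_text: str) -> List[str]:
    """Return the hedged readings whose substring appears in the log."""
    return [reading for needle, reading in SIGNATURES if needle in log_text]
```

Passing the first trace to triage yields the OpenGL reading, while the VirtualGL attempt yields the two display-related readings, which suggests two separate symptoms of the same rendering setup problem.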

KzZheng commented 3 years ago

Additionally, I can run the PyRep examples headless.

stepjam commented 3 years ago

Hi. It’s very strange that you don’t get this error when running PyRep, as the error seems to happen when starting CoppeliaSim. Which PyRep example have you tried running? What are the full commands that you run for both PyRep and RLBench?

KzZheng commented 3 years ago

> Hi. It’s very strange that you don’t get this error when running PyRep, as the error seems to happen when starting CoppeliaSim. Which PyRep example have you tried running? What are the full commands that you run for both PyRep and RLBench?

I ran example_baxter_pick_and_pass.py; the only change was setting headless=True on line 18. My command for RLBench is python ./examples/few_shot_rl.py, and for PyRep it is python ./examples/example_baxter_pick_and_pass.py. Both run in the same conda environment.

stepjam commented 3 years ago

Thanks. This might sound strange, but could you go into the PyRep examples folder and run the example from there, i.e. python example_baxter_pick_and_pass.py? I want to make sure you are using the installed PyRep package, and not the local one (./pyrep).
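A quick way to check which copy of a package Python would pick up is to ask the import system for the module's location without importing it. The sketch below is an illustration, not part of PyRep or RLBench; module_origin and is_installed_copy are hypothetical helper names, and the site-packages test assumes a regular (non-editable) pip or conda install.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def module_origin(name: str) -> Optional[Path]:
    """Return the file a top-level module would be loaded from, or None.

    find_spec() only locates a top-level module; it does not execute it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve()


def is_installed_copy(name: str) -> bool:
    """True if the module resolves to a site-packages/dist-packages folder."""
    origin = module_origin(name)
    if origin is None:
        return False
    return any(part in ("site-packages", "dist-packages")
               for part in origin.parts)
```

Calling print(module_origin("pyrep")) from an interpreter started in the folder in question shows whether the path points into the conda environment or into a local checkout.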

KzZheng commented 3 years ago

This error happens after env.launch(). Actually, I think it happens in descriptions, obs = task.reset()
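One way to pin down which call triggers the native crash is to announce each step before running it, so the last line printed before "Error: signal 11" names the culprit. The following is a minimal sketch, not RLBench code: run_steps and enable_native_traceback are hypothetical helpers, and faulthandler (standard library) is enabled on a best-effort basis, since CoppeliaSim appears to install its own segfault handler (_segHandler in the trace) and may override it.

```python
import faulthandler
import sys
from typing import Callable, Iterable, List, Optional, Tuple

Step = Tuple[str, Callable[[], object]]


def _stderr_log(message: str) -> None:
    # Flush immediately so the line is visible even if the process dies next.
    print(message, file=sys.stderr, flush=True)


def enable_native_traceback() -> bool:
    """Best effort: ask Python to dump its traceback on a fatal signal."""
    try:
        faulthandler.enable()
    except (AttributeError, OSError, RuntimeError, ValueError):
        # stderr may have been replaced by an object without a file descriptor.
        return False
    return True


def run_steps(steps: Iterable[Step],
              log: Optional[Callable[[str], None]] = None) -> List[str]:
    """Run named steps in order, logging before and after each one."""
    log = log or _stderr_log
    completed: List[str] = []
    for name, step in steps:
        log(f"starting: {name}")
        step()
        log(f"finished: {name}")
        completed.append(name)
    return completed
```

In the example script, the steps would be entries such as ("launch", env.launch) and ("reset", task.reset), where env and task are the objects the script already creates.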

KzZheng commented 3 years ago

> Thanks. This might sound strange, but could you go into the PyRep examples folder and run the example from there, i.e. python example_baxter_pick_and_pass.py? I want to make sure you are using the installed PyRep package, and not the local one (./pyrep).

I ran this command, and this is what I got:

Planning path for left arm to cup ...
Executing plan ...
Planning path closer to cup ...
Traceback (most recent call last):
  File "/home/kaizhi/.conda/envs/rlbench_env/lib/python3.7/site-packages/pyrep/robots/arms/arm.py", line 381, in get_nonlinear_path
    max_configs, distance_threshold, max_time_ms, relative_to)
  File "/home/kaizhi/.conda/envs/rlbench_env/lib/python3.7/site-packages/pyrep/robots/arms/arm.py", line 157, in solve_ik_via_sampling
    'Could not find a valid joint configuration for desired '
pyrep.errors.ConfigurationError: Could not find a valid joint configuration for desired end effector pose.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "example_baxter_pick_and_pass.py", line 42, in <module>
    quaternion=waypoints[1].get_quaternion())
  File "/home/kaizhi/.conda/envs/rlbench_env/lib/python3.7/site-packages/pyrep/robots/arms/arm.py", line 447, in get_path
    relative_to)
  File "/home/kaizhi/.conda/envs/rlbench_env/lib/python3.7/site-packages/pyrep/robots/arms/arm.py", line 383, in get_nonlinear_path
    raise ConfigurationPathError('Could not create path.') from e
pyrep.errors.ConfigurationPathError: Could not create path.
QMutex: destroying locked mutex

stepjam commented 3 years ago

Ah, that’s helpful info (about it happening after the launch). That makes me think that rendering is the issue, which would explain why the PyRep examples work.

It might be helpful to look at the CoppeliaSim output when it launches, as you might be missing some packages. Can you navigate to your CoppeliaSim home, run ./coppeliaSim.sh, and then paste the full output please?

KzZheng commented 3 years ago

> Ah, that’s helpful info (about it happening after the launch). That makes me think that rendering is the issue, which would explain why the PyRep examples work.
>
> It might be helpful to look at the CoppeliaSim output when it launches, as you might be missing some packages. Can you navigate to your CoppeliaSim home, run ./coppeliaSim.sh, and then paste the full output please?

I'm running on a server without a monitor, so I don't know whether that is causing the problem. Here is the output:

[CoppeliaSimClient]    loading the CoppeliaSim library...
[CoppeliaSimClient]    done.
[CoppeliaSimClient:loadinfo]   launching CoppeliaSim...
[CoppeliaSim:loadinfo]   CoppeliaSim V4.1.0., (rev. 1), flavor: 1
[CoppeliaSim:loadinfo]   Legacy machine ID: 5000-9DEB-FFC4-9CEC-F7E3-841D
[CoppeliaSim:loadinfo]   Machine ID: 9757-3B4E-5442-0000-8289-0101
[CoppeliaSim:loadinfo]   using the default Lua library.
[CoppeliaSim:loadinfo]   loaded the video compression library.
[CoppeliaSim:loadinfo]   if CoppeliaSim crashes now, try to install libgl1-mesa-dev on your system:
        >sudo apt install libgl1-mesa-dev

Error: signal 11:

/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libcoppeliaSim.so(_Z11_segHandleri+0x30)[0x7f5a35b5bae0]
/lib/x86_64-linux-gnu/libc.so.6(+0x46210)[0x7f5a38a66210]
/lib/x86_64-linux-gnu/libc.so.6(+0x18b675)[0x7f5a38bab675]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libcoppeliaSim.so(_Z15initGl_ifNeededv+0x11c)[0x7f5a35d01cac]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libcoppeliaSim.so(_ZN11CMainWindowC2Ev+0x61d)[0x7f5a35fda90d]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libcoppeliaSim.so(_ZN3App16createMainWindowEv+0x71)[0x7f5a35cf2531]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libcoppeliaSim.so(_Z24simRunSimulator_internalPKciPFvvES2_S2_iS0_b+0x475)[0x7f5a35b70fe5]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/libcoppeliaSim.so(simRunSimulatorEx+0x13)[0x7f5a35b59493]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/coppeliaSim(+0x2ad6)[0x55e0a3882ad6]
/home/kaizhi/Documents/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04/coppeliaSim(+0x251c)[0x55e0a388251c]

stepjam commented 3 years ago

If you read the output, it tells you what to do 😄

if CoppeliaSim crashes now, try to install libgl1-mesa-dev on your system:
        >sudo apt install libgl1-mesa-dev

KzZheng commented 3 years ago

Also, I already ran sudo apt install libgl1-mesa-dev. It didn't solve the problem.

stepjam commented 3 years ago

Are you able to run glxgears? This is a good way of debugging on a server.

KzZheng commented 3 years ago

> If you read the output, it tells you what to do 😄
>
> if CoppeliaSim crashes now, try to install libgl1-mesa-dev on your system:
>         >sudo apt install libgl1-mesa-dev

Yes, I already ran it. Here are my results:

Reading package lists... Done
Building dependency tree       
Reading state information... Done
libgl1-mesa-dev is already the newest version (21.0.3-0ubuntu0.3~20.04.3).
libgl1-mesa-dev set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 138 not upgraded.

KzZheng commented 3 years ago

> Are you able to run glxgears? This is a good way of debugging on a server.

Here is the output of glxgears:

X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  151 (GLX)
  Minor opcode of failed request:  3 (X_GLXCreateContext)
  Value in failed request:  0x0
  Serial number of failed request:  32
  Current serial number in output stream:  33

stepjam commented 3 years ago

Thank you. So this is neither a PyRep nor an RLBench issue; rather, something on your machine is not set up correctly for rendering. Once you get glxgears working, PyRep and RLBench should then also work.
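Because glxgears opens a window and runs until interrupted, a scripted check on a headless server is easier with a command that exits by itself. The sketch below assumes glxinfo (shipped alongside glxgears in Ubuntu's mesa-utils package) is installed and that it exits with a non-zero status when GLX context creation fails; check_glx is a hypothetical helper, not part of PyRep or RLBench.

```python
import shutil
import subprocess
from typing import Sequence, Tuple


def check_glx(cmd: Sequence[str] = ("glxinfo", "-B"),
              timeout: float = 10.0) -> Tuple[bool, str]:
    """Run a short GLX query and report (succeeded, output or error text)."""
    if shutil.which(cmd[0]) is None:
        return False, f"{cmd[0]} not found on PATH"
    try:
        proc = subprocess.run(list(cmd), capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, f"{cmd[0]} did not finish within {timeout} s"
    if proc.returncode != 0:
        # Prefer stderr, where X errors such as BadValue are printed.
        detail = proc.stderr.strip() or proc.stdout.strip()
        return False, detail or f"exit status {proc.returncode}"
    return True, proc.stdout.strip()
```

Running ok, detail = check_glx() before launching the simulator gives an early, readable failure instead of a segfault inside the renderer.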

stepjam commented 3 years ago

I Googled your error; perhaps this thread can help: https://askubuntu.com/questions/893922/ubuntu-16-04-gives-x-error-of-failed-request-badvalue-integer-parameter-out-o

KzZheng commented 3 years ago

Thanks! I will check that.

FB-wh commented 8 months ago

I had the exact same problem. How did you solve it?

obito8065 commented 4 months ago

> Thanks. This might sound strange, but could you go into the PyRep examples folder and run the example from there, i.e. python example_baxter_pick_and_pass.py? I want to make sure you are using the installed PyRep package, and not the local one (./pyrep).

Excuse me, may I ask a question? I get the same signal 11 as the author. When I run 'python example_baxter_pick_and_pass.py' in PyRep, CoppeliaSim shows a warning like "Could not find or correctly load the video compression library". CoppeliaSim still runs after I close this warning. How can I solve it?