ARISE-Initiative / robosuite

robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
https://robosuite.ai

Error occurred when running GAIL #102

Closed haoyu-x closed 4 years ago

haoyu-x commented 4 years ago

Hi, thanks for sharing this great project!

I'm using GAIfO from tf2rl (https://github.com/keiohta/tf2rl) with state-based observations, the OSC position controller, and the Sawyer robot, on robosuite v1.0.

After some training, the program gets interrupted by an error:

20:40:50.382 [INFO] (irl_trainer.py:74) Total Epi:    26 Steps:   13000 Episode Steps:   500 Return:  20.0922 FPS:  9.42
20:41:43.588 [INFO] (irl_trainer.py:74) Total Epi:    27 Steps:   13500 Episode Steps:   500 Return:  21.9982 FPS:  9.40
Traceback (most recent call last):
File "run_gaifo_robosuite.py", line 134, in <module>
trainer()
File "/home/haoyux/tf2rl-master/tf2rl/experiments/irl_trainer.py", line 52, in __call__
next_obs, reward, done, _ = self._env.step(action)
File "/home/haoyux/tf2rl-master/robosuite/wrappers/gym_wrapper.py", line 75, in step
ob_dict, reward, done, info = self.env.step(action)
File "/home/haoyux/tf2rl-master/robosuite/environments/base.py", line 225, in step
self.sim.step()
File "mujoco_py/mjsim.pyx", line 126, in mujoco_py.cymj.MjSim.step
File "mujoco_py/cymj.pyx", line 115, in mujoco_py.cymj.wrap_mujoco_warning.__exit__
File "mujoco_py/cymj.pyx", line 75, in mujoco_py.cymj.c_warning_callback
File "/home/haoyux/venv/lib/python3.6/site-packages/mujoco_py/builder.py", line 354, in user_warning_raise_exception
raise MujocoException(warn + 'Check for NaN in simulation.')
mujoco_py.builder.MujocoException: Unknown warning type Time = 17.3500.Check for NaN in simulation.

I'm wondering why this happened. Thanks!

cremebrule commented 4 years ago

Hi,

Glad you're finding a good use for robosuite. :) Thanks for bringing up this issue. It looks like you're getting NaNs from the MuJoCo sim. Usually this is due to high acceleration spikes in the simulation, which can in turn come from multiple factors: a badly trained policy or a poorly tuned controller can cause such instability, and it can also be caused or exacerbated by poorly tuned physical parameters (e.g., inertial values, damping).
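One way to keep a long training run alive while you debug the root cause is to wrap the environment so that a simulation blow-up truncates the episode instead of crashing the trainer. This is only a minimal sketch, not robosuite API: `NanGuardWrapper` and its `unstable_exc` parameter are hypothetical names, and in practice you would pass `mujoco_py.builder.MujocoException` as the exception type to catch.

```python
class NanGuardWrapper:
    """Hypothetical wrapper (not part of robosuite): end the episode
    instead of crashing when the underlying sim raises an instability
    exception (in practice, mujoco_py.builder.MujocoException)."""

    def __init__(self, env, unstable_exc=Exception):
        self.env = env
        self.unstable_exc = unstable_exc

    def reset(self):
        return self.env.reset()

    def step(self, action):
        try:
            return self.env.step(action)
        except self.unstable_exc:
            # Sim diverged (NaN warning): reset and truncate the episode
            # with zero reward so the outer training loop continues cleanly.
            obs = self.env.reset()
            return obs, 0.0, True, {"sim_diverged": True}
```

Note that silently truncating episodes only masks the instability; it's a stopgap so you can collect logs on when and where the divergence happens.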

I would recommend trying different robot / controller combinations and seeing which ones trigger the NaN error; isolating the source of the NaNs will be the most useful step toward fixing the problem on your local branch.
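A systematic way to do that isolation is to sweep over robot/controller pairs and record which ones blow up. The sketch below only builds the sweep list; the commented-out construction shows roughly where the real environment creation would go (robot and controller names like `Sawyer` and `OSC_POSITION` follow robosuite v1.0's naming, but treat the loop body as pseudocode for your own runner).

```python
from itertools import product

# Example names as used in robosuite v1.0; adjust to your installed version.
robots = ["Sawyer", "Panda"]
controllers = ["OSC_POSITION", "OSC_POSE", "JOINT_VELOCITY"]

def build_sweep(robots, controllers):
    """Return every (robot, controller) pair to test for NaN blow-ups."""
    return list(product(robots, controllers))

for robot, controller in build_sweep(robots, controllers):
    # In a real run you would build the env here and roll out a fixed
    # number of random-action steps, recording whether MujocoException
    # fires, e.g. (untested sketch):
    #   config = load_controller_config(default_controller=controller)
    #   env = suite.make("Lift", robots=robot, controller_configs=config)
    print(robot, controller)
```

Running a few hundred random-action steps per pair is usually enough to surface the unstable combinations without any trained policy in the loop.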

The branch you currently have had a lot of bugs with the new grippers and robots that were added; the majority of those major issues have since been cleaned up. We've been overhauling our v1.0 branch privately and will release it officially soon. In the meantime, hopefully these comments help with your development and research!

I'm closing this issue for now, but once we release v1.0 officially, feel free to open another issue if your problem persists.