jr-robotics / robo-gym

An open source toolkit for Distributed Deep Reinforcement Learning on real and simulated robots.
https://sites.google.com/view/robo-gym
MIT License

Issues running td3_script.py #22

Closed bungee31 closed 3 years ago

bungee31 commented 3 years ago

Hi,

I am trying to run the td3_script.py from docs/examples. During the training, it raises an exception:

Traceback (most recent call last):
  File "td3_script.py", line 17, in <module>
    model.learn(total_timesteps=15000)
  File "/home/andrei/.local/lib/python3.6/site-packages/stable_baselines/td3/td3.py", line 330, in learn
    new_obs, reward, done, info = self.env.step(unscaled_action)
  File "/home/andrei/devarea/gym_ws/robo-gym/robo_gym/wrappers/exception_handling.py", line 9, in step
    observation, reward, done, info = self.env.step(action)
  File "/home/andrei/.local/lib/python3.6/site-packages/gym/wrappers/time_limit.py", line 16, in step
    observation, reward, done, info = self.env.step(action)
  File "/home/andrei/devarea/gym_ws/robo-gym/robo_gym/envs/mir100/mir100.py", line 125, in step
    assert self.action_space.contains(action), "%r (%s) invalid" % (action, type(action))
AssertionError: array([-1.0000001, 0.6030375], dtype=float32) (<class 'numpy.ndarray'>) invalid

Apparently the action falls slightly outside the allowed action range. Is this part of the TD3 implementation under the hood, or can it be fixed straightforwardly?
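For context, the assertion fails because float32 rounding pushes the first action component a hair below the lower bound of the `[-1, 1]` action space. A small standalone check (using only numpy, mirroring the bounds test that gym's `Box.contains` performs; the bound values here are assumed from the traceback) reproduces it:

```python
import numpy as np

# The action from the traceback, as float32 (the dtype stable-baselines emits)
action = np.array([-1.0000001, 0.6030375], dtype=np.float32)
low, high = -1.0, 1.0

# Mirrors the bounds check inside gym's Box.contains: the first component
# rounds to a float32 value just below -1.0, so the check fails.
in_bounds = bool(np.all(action >= low) and np.all(action <= high))
print(in_bounds)  # False

# Clipping pulls the value back into range, and the same check then passes.
clipped = np.clip(action, low, high)
print(bool(np.all(clipped >= low) and np.all(clipped <= high)))  # True
```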

friedemannzindler commented 3 years ago

Hi!

Thank you for reaching out to us. Unfortunately, I am having trouble reproducing this error. I just ran the td3_script a few times and had no problems.

How often does this error occur? Does it occur immediately after starting the training?

If it occurs very rarely and you have installed robo-gym in editable mode, a quick fix might be to clip the action again in the step function: np.clip(action, -1.0, 1.0).
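A minimal sketch of that quick fix (the helper name is illustrative; in practice the clip would go at the top of the env's step() in robo_gym/envs/mir100/mir100.py, before the action_space.contains() assertion):

```python
import numpy as np

def clip_action(action, low=-1.0, high=1.0):
    """Clamp the agent's action back into the Box bounds.

    Sketch of the suggested workaround: tiny float32 overshoots from
    the agent (e.g. -1.0000001) are mapped back onto the boundary, so
    the subsequent action_space.contains() assertion no longer fails.
    """
    return np.clip(action, low, high)

# The offending action from the traceback now passes the bounds check:
action = np.array([-1.0000001, 0.6030375], dtype=np.float32)
safe = clip_action(action)
print(safe[0])  # clamped to the lower bound, -1.0
```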