benelot / pybullet-gym

Open-source implementations of OpenAI Gym MuJoCo environments for use with the OpenAI Gym Reinforcement Learning Research Platform.
https://pybullet.org/

Variable 'done' returns an array of two boolean values. #43

Closed: bryanbocao closed this issue 3 years ago

bryanbocao commented 4 years ago

https://github.com/benelot/pybullet-gym/blob/79396fccc1d0e170a092b50137148da48ee2edab/pybulletgym/envs/mujoco/envs/pendulum/inverted_pendulum_env.py#L32

The variable 'done' computed at the line linked above is an array of two boolean values instead of a single boolean.

Credit to Xiang Li.

pierre-si commented 4 years ago

"done" is an array rather than a single boolean because a call to ravel() is missing in inverted_pendulum.py: "state" should be a 1D array, not a 2D array. https://github.com/benelot/pybullet-gym/blob/79396fccc1d0e170a092b50137148da48ee2edab/pybulletgym/envs/mujoco/robots/pendula/inverted_pendulum.py#L50-L52 See #47.
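To illustrate the shape problem, here is a minimal NumPy-only sketch. It is not the repository's code: the sample values, the helper name is_done, and the 0.2 threshold are assumptions chosen for illustration. It shows how indexing a 2D state yields a whole row, so a comparison on it produces two booleans, while the raveled 1D state yields a single one.

```python
import numpy as np

# Hypothetical sample values (not taken from the repository).
x, theta = 0.0, 0.05        # slider position, pole angle
vx, theta_dot = 0.1, -0.3   # slider velocity, pole angular velocity

# Without ravel(): positions and velocities stacked as rows, shape (2, 2).
state_2d = np.array([[x, theta], [vx, theta_dot]])

# With ravel(): the same numbers flattened to shape (4,).
state_1d = state_2d.ravel()


def is_done(state, limit=0.2):
    """Hypothetical termination check: non-finite state or angle beyond limit."""
    # state[1] is a scalar for a 1D state but an entire row for a 2D state.
    return not np.isfinite(state).all() or np.abs(state[1]) > limit


done_2d = is_done(state_2d)
done_1d = is_done(state_1d)

print(np.shape(done_2d))  # (2,): an array of two booleans
print(np.shape(done_1d))  # (): a single boolean value
```

With the 2D state, state[1] selects the row [vx, theta_dot], so the comparison is applied elementwise. After ravel(), state[1] is the scalar theta, and the check returns one value.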

benelot commented 3 years ago

Somebody made a pull request and fixed this. Thanks for reporting!