bulletphysics / bullet3

Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
http://bulletphysics.org

Converting action to deltav errors? #1372

Closed rebcabin closed 6 years ago

rebcabin commented 6 years ago

https://github.com/bulletphysics/bullet3/blob/7d7f5ee7d4787f632446dbef6422a18ac814255c/examples/pybullet/gym/pybullet_envs/bullet/cartpole_bullet.py#L69

I get

TypeError: only integer scalar arrays can be converted to a scalar index

on this line when running

python2 cartpole_bullet_gym_example.py

I have not been able to make it work under Python 3 due to environment-setup issues on my computer.

action always appears to be an array of length 1 containing a single floating-point value between -1 and 1 (inclusive?). I tried mapping action[0] to [0..8], the index domain of the list that action is used to index. That is, I guessed that the index expression should be

int (4 * (action[0] + 1))

but this is so far from what is written that I have no confidence in my guess. Moreover, the reinforcement learning does not converge, so I am not sure what the author of this code was trying to accomplish. Meanwhile, I looked at the cited original C code from Sutton & Barto, but it is nothing like the code here and was no help with my reverse engineering.

All that aside, I claim the code does not run, as written, under Python 2.7.13.
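For illustration only, the failure and the guessed remapping can be reproduced in isolation. This is a minimal sketch, not the code from cartpole_bullet.py: the 9-entry table values and the helper name `action_to_index` are hypothetical, and the only facts taken from the report are that action is a length-1 float array in [-1, 1] and that the indexed list has positions 0 through 8. The sketch rounds to the nearest bin, whereas the guess above truncates with int(); whether either mapping matches the original author's intent is unknown.

```python
import numpy as np

# Hypothetical 9-entry lookup table (positions 0..8); the values are
# placeholders, not the ones used in cartpole_bullet.py.
DELTA_V_TABLE = [-1.0, -0.5, -0.2, -0.01, 0.0, 0.01, 0.2, 0.5, 1.0]

# Indexing a plain Python list with a float NumPy array raises TypeError,
# which is the kind of failure described in the report.
raised_type_error = False
try:
    DELTA_V_TABLE[np.array([0.3])]
except TypeError:
    raised_type_error = True


def action_to_index(action, n_bins=len(DELTA_V_TABLE)):
    """Map a length-1 continuous action in [-1, 1] to an integer in [0, n_bins - 1].

    Assumption: evenly spaced bins with rounding to the nearest bin.
    """
    a = float(np.asarray(action, dtype=float).ravel()[0])
    a = min(1.0, max(-1.0, a))  # clamp so the index cannot leave the table
    return int(round((a + 1.0) * (n_bins - 1) / 2.0))


# Usage: look up the table entry for a sampled action.
deltav = DELTA_V_TABLE[action_to_index(np.array([0.3]))]
```

With truncation, as in int(4 * (action[0] + 1)), index 8 is reached only when action[0] is exactly 1; rounding gives the two end bins half-width ranges instead. Neither choice says anything about why training fails to converge.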

erwincoumans commented 6 years ago

Don't use that cartpole_bullet.py code; use the Gym environment InvertedPendulumBulletEnv-v0 instead, implemented in https://github.com/bulletphysics/bullet3/blob/master/examples/pybullet/gym/pybullet_envs/gym_pendulum_envs.py

There is training using TF Agents. A video of the double pendulum is here: https://www.youtube.com/watch?v=pcimpGkI8OQ

See also the Reinforcement Learning section of the pybullet quickstart guide, which describes the various environments. (cartpole_bullet.py is some test code from a colleague of mine; I'll remove it to avoid confusion.)

rebcabin commented 6 years ago

Great. Thanks!