Closed by rebcabin 6 years ago
Don't use that cartpoly.py code; use the Gym environment InvertedPendulumBulletEnv-v0 instead, which is implemented in https://github.com/bulletphysics/bullet3/blob/master/examples/pybullet/gym/pybullet_envs/gym_pendulum_envs.py
There is training using TF Agents. A video of the double pendulum is here: https://www.youtube.com/watch?v=pcimpGkI8OQ
See also the Reinforcement Learning section of the pybullet quickstart guide, which describes the various environments. (cartpoly.py is some test code from a colleague of mine; I'll remove it to avoid confusion.)
Great. Thanks!
https://github.com/bulletphysics/bullet3/blob/7d7f5ee7d4787f632446dbef6422a18ac814255c/examples/pybullet/gym/pybullet_envs/bullet/cartpole_bullet.py#L69
I get an error on this line when running the code.
I have not been able to make it work under Python 3 due to environment-setup issues on my computer.
`action` always appears to be an array of length 1 containing a single floating-point value between -1 and 1 (inclusive?). I tried mapping `action[0]` to `[0..8]`, the index domain of the array that `action` is trying to index. That is, I guessed at a replacement for the index expression, but my guess is so far from what is written that I have no confidence in it. In addition, the reinforcement learning does not converge, so I am not sure what the author of this code was trying to accomplish. Meanwhile, I looked at the cited original C code from Sutton & Barto, but it is nothing like the code here and was no help with my reverse engineering.
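For illustration only, here is a minimal sketch of one way a value in [-1, 1] could be mapped onto the integer indices 0 to 8. The function name `action_to_index` and the `num_bins` parameter are hypothetical and do not come from the repository. The sketch assumes that `action` is a length-1 sequence and that the indexed array has nine entries. It is a guess at a workaround, not necessarily what the author of the code intended.

```python
def action_to_index(action, num_bins=9):
    """Map action[0], assumed to lie in [-1, 1], to an index in [0, num_bins - 1].

    Hypothetical helper: the linear scaling and the nine bins are assumptions.
    """
    value = float(action[0])
    # Clamp so that out-of-range inputs cannot produce an invalid index.
    value = max(-1.0, min(1.0, value))
    # Rescale [-1, 1] to [0, num_bins - 1], then round to the nearest integer.
    scaled = (value + 1.0) / 2.0 * (num_bins - 1)
    return int(round(scaled))


if __name__ == "__main__":
    for a in ([-1.0], [0.0], [1.0]):
        print(a, action_to_index(a))
```

With this mapping, -1 selects the first entry, 0 the middle entry, and 1 the last entry. Whether that matches the intended semantics of the indexed array is an open question.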
All that aside, I claim the code does not run, as written, under Python 2.7.13.