pat-coady / trpo

Trust Region Policy Optimization with TensorFlow and OpenAI Gym
https://learningai.io/projects/2017/07/28/ai-gym-workout.html
MIT License

Help Getting Cart Pole to Run #32

Closed ryanmaxwell96 closed 4 years ago

ryanmaxwell96 commented 4 years ago

When I try to run the CartPole environment, I run into this error:

Traceback (most recent call last):
  File "train.py", line 349, in <module>
    main(**vars(args))
  File "train.py", line 289, in main
    env, obs_dim, act_dim = init_gym(env_name)
  File "train.py", line 72, in init_gym
    act_dim = env.action_space.shape[0]
IndexError: tuple index out of range

After a little digging, I added self.action_space = np.array([1]) at line 53 of cartpole_bullet.py. That removes the error and lets the environment run, but I'm not sure whether it is causing training problems, because it still isn't training after 1000 episodes. Is there an official fix for this?
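For context, the traceback is consistent with the action space reporting an empty shape tuple, which is what Gym's discrete spaces do. The sketch below reproduces that failure mode. It uses stand-in objects (hypothetical, not the real Gym classes) so it runs without Gym or PyBullet installed, and `act_dim_from_shape` is an illustrative name that mirrors the failing line in `init_gym`.

```python
from types import SimpleNamespace

# Stand-ins for Gym spaces (hypothetical, for illustration only).
# A discrete action space reports an empty shape; a 1-D continuous one reports (1,).
discrete_like = SimpleNamespace(shape=(), n=2)
box_like = SimpleNamespace(shape=(1,))


def act_dim_from_shape(space):
    """Mirror the failing line: act_dim = env.action_space.shape[0]."""
    return space.shape[0]


print(act_dim_from_shape(box_like))  # works for the continuous-style space

try:
    act_dim_from_shape(discrete_like)
except IndexError as err:
    # Indexing an empty tuple raises the same error seen in the traceback.
    print("IndexError:", err)
```

If this is the cause, the error says nothing about the environment being broken; it only means the training script assumes a continuous action space with at least one dimension.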

erwincoumans commented 4 years ago

You likely want to use CartPoleContinuousBulletEnv-v0 instead. Can you try it? I just ran python3 train.py CartPoleContinuousBulletEnv-v0 and it trains fine:

python3 train.py CartPoleContinuousBulletEnv-v0

***** Episode 900, Mean R = 187.4 *****
Beta: 1
ExplainedVarNew: -0.926
ExplainedVarOld: -0.916
KL: 0.00559
PolicyEntropy: 0.92
PolicyLoss: -0.00791
Steps: 3.75e+03
ValFuncLoss: 0.062

***** Episode 920, Mean R = 200.0 *****
Beta: 1
ExplainedVarNew: -0.997
ExplainedVarOld: -0.93
KL: 0.00195
PolicyEntropy: 0.92
PolicyLoss: -0.00145
Steps: 4e+03
ValFuncLoss: 0.0654

I'll submit a PR to change the name to CartPoleContinuousBulletEnv-v0.

ryanmaxwell96 commented 4 years ago

Thank you. I had to set act_dim = 1 and obs_dim = 4 when env_name == 'CartPoleBulletEnv-v1' in the init_gym(env_name) function in train.py, and that got it to work.
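The workaround described above could look roughly like the following. This is a minimal sketch, not the actual code in train.py: `infer_dims` is a hypothetical helper, and the space objects are stand-ins with illustrative shapes so the example runs without Gym installed.

```python
from types import SimpleNamespace


def infer_dims(env_name, observation_space, action_space):
    """Return (obs_dim, act_dim), hard-coding them for the discrete CartPole env."""
    if env_name == 'CartPoleBulletEnv-v1':
        # Values taken from the comment above: obs_dim = 4, act_dim = 1.
        return 4, 1
    # Default path: read the first dimension of each space's shape.
    return observation_space.shape[0], action_space.shape[0]


# Stand-in spaces (hypothetical shapes, for illustration only).
obs_space = SimpleNamespace(shape=(4,))
discrete_actions = SimpleNamespace(shape=(), n=2)
continuous_actions = SimpleNamespace(shape=(1,))

print(infer_dims('CartPoleBulletEnv-v1', obs_space, discrete_actions))
print(infer_dims('CartPoleContinuousBulletEnv-v0', obs_space, continuous_actions))
```

The override only sidesteps the shape lookup. It does not change what kind of action the environment expects, so switching to the continuous variant suggested earlier may still be the simpler route.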