vwxyzjn / ppo-implementation-details

The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/

Are these arguments correct? Or is it very hard to train MountainCar-v0 by PPO? #7

Open alanyuwenche opened 10 months ago

alanyuwenche commented 10 months ago

I used this code to train an agent on MountainCar-v0, but training still failed after 10 million timesteps. This is the command I ran:

!python ppo.py --gym-id MountainCar-v0 --total-timesteps 10000000

[Image: ret]

Are these arguments correct? Or is MountainCar-v0 just very hard to train with PPO?
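
For context on the second question, here is a minimal sketch of the MountainCar-v0 dynamics and reward as I understand them from Gym. It is a re-implementation under assumptions, not the library's code and not part of `ppo.py`. The constants, the 200-step time limit, and the two hand-written policies (`always_right`, `follow_velocity`) are illustrative. The point is that every step costs -1 until the car reaches the flag, so any policy that never reaches it sees the same return of -200 and gets no signal about which actions were better.

```python
import math

# Constants as I understand Gym's MountainCar-v0 to use them (assumption).
FORCE, GRAVITY = 0.001, 0.0025
MIN_POS, MAX_POS, MAX_SPEED, GOAL_POS = -1.2, 0.6, 0.07, 0.5
MAX_STEPS = 200  # episode time limit


def step(pos, vel, action):
    """Advance one step; action is 0 (push left), 1 (no push) or 2 (push right)."""
    vel += (action - 1) * FORCE - GRAVITY * math.cos(3 * pos)
    vel = max(-MAX_SPEED, min(MAX_SPEED, vel))
    pos = max(MIN_POS, min(MAX_POS, pos + vel))
    if pos == MIN_POS and vel < 0:
        vel = 0.0  # the car stops when it hits the left wall
    return pos, vel


def episode_return(policy, start_pos=-0.5):
    """Run one episode from rest and return the sum of the -1 per-step rewards."""
    pos, vel, total = start_pos, 0.0, 0.0
    for _ in range(MAX_STEPS):
        pos, vel = step(pos, vel, policy(pos, vel))
        total -= 1.0
        if pos >= GOAL_POS:  # simplified goal check: position only
            break
    return total


def always_right(pos, vel):
    """Hypothetical policy: push toward the flag on every step."""
    return 2


def follow_velocity(pos, vel):
    """Hypothetical policy: push in the direction the car is already moving."""
    return 2 if vel >= 0 else 0


if __name__ == "__main__":
    print("always_right:", episode_return(always_right))
    print("follow_velocity:", episode_return(follow_velocity))
```

In this sketch `always_right` never builds enough momentum and ends with a return of -200, while `follow_velocity` swings back and forth and reaches the flag before the time limit. An agent that explores with near-random actions is unlikely to stumble on that swinging behaviour, which is one plausible reason the return curve can stay flat with the default arguments.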