nikhilbarhate99 / PPO-PyTorch

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
MIT License
1.67k stars, 343 forks

The `ppo_continuous.py` model does not learn #5

Closed by chingandy 5 years ago

chingandy commented 5 years ago

Thank you for sharing your PPO implementation in this repository.

However, when I ran `ppo_continuous.py`, I found that the average reward was not increasing at all. Doesn't that mean the model is not learning?

nikhilbarhate99 commented 5 years ago

Hey, I updated the repo with a bug fix. Try again and let me know.

chingandy commented 5 years ago

The model seems to work now. Thank you for the update.