[Open] wadx2019 opened this issue 3 years ago
In PPO.ipynb, the action-loss epoch and the value-loss epoch are in the wrong order and should be swapped. I'd also suggest using RMSprop as the optimizer and reducing the learning rate to help these RL models converge.
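For concreteness, here is a minimal sketch of the update structure I have in mind, assuming the notebook uses PyTorch. The networks, shapes, and `lr=1e-4` are hypothetical placeholders, not the notebook's actual values; the point is only that the clipped surrogate (action) loss should drive the actor and the MSE (value) loss the critic, each with an RMSprop optimizer at a smaller learning rate:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins for the notebook's actor and critic networks.
actor = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
critic = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 1))

# Suggested change: RMSprop with a reduced learning rate
# (1e-4 is an example value, not the notebook's original).
actor_optim = torch.optim.RMSprop(actor.parameters(), lr=1e-4)
critic_optim = torch.optim.RMSprop(critic.parameters(), lr=1e-4)

def ppo_update(states, actions, old_log_probs, returns, advantages,
               clip_eps=0.2, epochs=10):
    for _ in range(epochs):
        # Action-loss epoch: the clipped surrogate loss updates the actor.
        dist = torch.distributions.Categorical(logits=actor(states))
        ratio = torch.exp(dist.log_prob(actions) - old_log_probs)
        surr1 = ratio * advantages
        surr2 = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
        action_loss = -torch.min(surr1, surr2).mean()
        actor_optim.zero_grad()
        action_loss.backward()
        actor_optim.step()

        # Value-loss epoch: the MSE loss updates the critic. This is the
        # pairing I believe the notebook currently has swapped.
        value_loss = F.mse_loss(critic(states).squeeze(-1), returns)
        critic_optim.zero_grad()
        value_loss.backward()
        critic_optim.step()
```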