nikhilbarhate99 / PPO-PyTorch

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
MIT License
1.57k stars, 332 forks

Test results are not good #59

Open 295885025 opened 1 year ago

295885025 commented 1 year ago

Hello! After I finish training and test the model, I find that the test results are far from the training results. For example, the average training reward is about 80, but the test reward may be around 40. Could you please help explain the difference between training and testing?

Thanks!