PPO Algorithm Convergence Issue: Ladder Degradation Problem

ericyangyu / PPO-for-Beginners

A simple and well styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.

MIT License

769 stars, 116 forks

PPO Algorithm Convergence Issue: Ladder Degradation Problem #17

Open heping103 opened 2 months ago

heping103 commented 2 months ago

[Screenshot 2024-09-12 102525] I customized an environment and trained it with the PPO algorithm. Why does my policy suddenly collapse as the model is trained? Is this a problem with my environment, or is it a common problem in reinforcement learning? How do I fix it? Thank you for your help; I look forward to your response.

ericyangyu commented 1 month ago

Hi, thanks for reaching out. Off the top of my head, there could be several reasons for this, but I cannot say much unless I know more about the task you want to train on. I know it's been a few weeks since you posted this, but if you still have questions, feel free to email me!