facebookresearch / ReAgent

A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)
https://reagent.ai
BSD 3-Clause "New" or "Revised" License
3.58k stars 521 forks

Current plans or progress for TRPO and PPO #213

Open balloch opened 4 years ago

balloch commented 4 years ago

I'm just curious whether there is any effort to add these policy gradient methods?

czxttkl commented 4 years ago

PPO is on our TODO list, but it has low priority and is waiting for someone to pick it up.