cogment / cogment-verse

Research platform for Human-in-the-loop learning (HILL) & Multi-Agent Reinforcement Learning (MARL)
https://cogment.ai/cogment_verse
Apache License 2.0
80 stars, 15 forks

Debug/Benchmark Async PPO implementation #177

Closed (cloderic closed this issue 1 year ago)

lhnguyen102 commented 1 year ago

Here is the performance of APPO on Hopper-v4. For comparison, the Hopper benchmark can also be found in [1].

[Image: reward_comp_hopper]

[1] Figure 3, Proximal Policy Optimization Algorithms (Schulman et al., 2017)