This task has been given low priority, as other issues need to be addressed first.
The general structure is ready on the PPO branch and runnable with one gradient step. However, the single-agent policy seems to converge to extreme values, so nothing of much value is learned.
We should start working on a new DRL algorithm based on the multi-agent PPO (MAPPO) algorithm; it promises significant speed improvements and would resolve the criticism of the centralized-critic approach.
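
Not spelled out in this issue, but as a rough point of reference, here is a minimal MAPPO-style sketch in PyTorch (all class names, dimensions, and hyperparameters are hypothetical, not taken from the repo): decentralized per-agent actors, one centralized critic over the joint observation, and the clipped PPO surrogate for the actor update.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only
N_AGENTS, OBS_DIM, ACT_DIM = 2, 8, 3

class Actor(nn.Module):
    """Decentralized policy: each agent acts on its own observation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.Tanh(), nn.Linear(64, ACT_DIM)
        )

    def forward(self, obs):
        # Discrete action distribution over ACT_DIM actions
        return torch.distributions.Categorical(logits=self.net(obs))

class CentralCritic(nn.Module):
    """Centralized value function: sees the joint observation of all agents."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_AGENTS * OBS_DIM, 64), nn.Tanh(), nn.Linear(64, 1)
        )

    def forward(self, joint_obs):
        return self.net(joint_obs).squeeze(-1)

def ppo_loss(actor, obs, actions, old_log_probs, advantages, clip_eps=0.2):
    """Clipped PPO surrogate for one agent's actor."""
    dist = actor(obs)
    log_probs = dist.log_prob(actions)
    ratio = torch.exp(log_probs - old_log_probs)
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # Pessimistic (min) clipped objective, negated for gradient descent
    return -torch.min(ratio * advantages, clipped * advantages).mean()

# Example usage with a random batch (illustrative only):
actor = Actor()
obs = torch.randn(32, OBS_DIM)
actions = torch.randint(0, ACT_DIM, (32,))
old_log_probs = actor(obs).log_prob(actions).detach()
advantages = torch.randn(32)
loss = ppo_loss(actor, obs, actions, old_log_probs, advantages)
loss.backward()
```

If this matches the intended design, the ratio clipping is also what should keep a single gradient step from jumping to the extreme values observed above.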