instadeepai / Mava

🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX

Feat/learning rate decay #998

Closed. RuanJohn closed this PR 8 months ago.

RuanJohn commented 8 months ago

What?

Add the option to linearly decay the actor and critic learning rates during training.

Why?

Based on the PPO implementation details blog post here.

How?

A simple utils file creates a linear learning-rate decay schedule and passes it to the optimisers.
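As a rough sketch of the idea (not the exact utils in this PR; the function name `make_learning_rate_schedule` and its parameters are illustrative), a linear schedule can be built as a plain callable and handed to optax in place of a fixed learning rate:

```python
import optax


def make_learning_rate_schedule(
    init_lr: float, num_updates: int, num_epochs: int, num_minibatches: int
) -> optax.Schedule:
    """Linearly anneal the learning rate from init_lr towards zero.

    The schedule is stepped once per optimiser update, so the total number
    of steps is num_updates * num_epochs * num_minibatches.
    """

    def schedule(count: int) -> float:
        frac = 1.0 - count / (num_updates * num_epochs * num_minibatches)
        return init_lr * frac

    return schedule


# Pass the schedule to the optimiser where a constant learning rate would go.
actor_lr = make_learning_rate_schedule(
    2.5e-4, num_updates=1000, num_epochs=4, num_minibatches=4
)
actor_optim = optax.chain(
    optax.clip_by_global_norm(0.5),
    optax.adam(learning_rate=actor_lr, eps=1e-5),
)
```

optax accepts a schedule callable anywhere a constant `learning_rate` is expected, so the same pattern applies to the critic optimiser.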

Extra

Default behaviour (constant learning rates) is retained by setting decay_learning_rates: False in the system configs.
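For illustration, toggling between the constant and decayed behaviour could look roughly like the snippet below; apart from `decay_learning_rates`, the config field names are assumptions, not the actual Mava config keys.

```python
# Hypothetical sketch: pick a constant LR or the linear decay schedule
# depending on the decay_learning_rates flag from the system config.
actor_lr = (
    make_learning_rate_schedule(
        config.system.actor_lr,
        num_updates=config.system.num_updates,
        num_epochs=config.system.ppo_epochs,
        num_minibatches=config.system.num_minibatches,
    )
    if config.system.decay_learning_rates
    else config.system.actor_lr
)
actor_optim = optax.chain(
    optax.clip_by_global_norm(config.system.max_grad_norm),
    optax.adam(learning_rate=actor_lr, eps=1e-5),
)
```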