Feat: Full support for continuous actions case

instadeepai / Mava

🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX

Apache License 2.0


Feat: Full support for continuous actions case #1000

Closed. WiemKhlifi closed this 7 months ago

WiemKhlifi commented 8 months ago

What?

This PR adds full support for continuous actions in all Mava PPO systems.

Why?

Add support for environments with continuous actions.

How?

Add continuous PPO systems using the networks and utility functions introduced in PR #999.
Add configs for all systems and networks.
Modify the evaluator so that it applies the transformations the actor_policy makes to selected actions.
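
To illustrate the evaluator change, here is a minimal JAX sketch, not code from this PR. It assumes a Gaussian policy whose unbounded output is squashed with tanh and rescaled to the action bounds. The names `squash` and `select_action` and the `low`/`high` bounds are hypothetical and are not Mava's API. The point is that training and evaluation go through the same transformation, so evaluated actions stay inside the environment's action range.

```python
import jax
import jax.numpy as jnp


def squash(raw_action, low, high):
    """Map an unbounded action into [low, high] with tanh, then rescale."""
    return low + 0.5 * (jnp.tanh(raw_action) + 1.0) * (high - low)


def select_action(mean, log_std, key, low, high, deterministic=False):
    """Hypothetical actor policy: Gaussian sample (or mean), then squash."""
    if deterministic:
        # Evaluation: use the Gaussian mean instead of sampling.
        raw = mean
    else:
        # Training: reparameterised sample from N(mean, exp(log_std)^2).
        raw = mean + jnp.exp(log_std) * jax.random.normal(key, mean.shape)
    # The same transformation is applied in both modes.
    return squash(raw, low, high)


key = jax.random.PRNGKey(0)
mean = jnp.array([0.0, 3.0, -3.0])
log_std = jnp.zeros(3)

train_action = select_action(mean, log_std, key, low=-2.0, high=2.0)
eval_action = select_action(
    mean, log_std, key, low=-2.0, high=2.0, deterministic=True
)
```

If the evaluator skipped `squash` and used the raw mean, the second and third actions here (3.0 and -3.0) would fall outside the assumed bounds of [-2, 2].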

Extra

This PR is a continuation of #999, so #999 should be reviewed first 🙌

WiemKhlifi commented 7 months ago

We're closing this PR because we'll adopt a new way of initialising the action distribution in the continuous case, using non-vmapped networks, which avoids creating separate system files.