instadeepai / Mava

🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX
Apache License 2.0

Feat: Continuous actions networks and training utils #999

Closed WiemKhlifi closed 9 months ago

WiemKhlifi commented 9 months ago

What?

This PR adds continuous-action networks and training utilities that transform the sampled actions and their log probabilities. It also updates the configs and the JaxMARL wrapper accordingly, since mabrax doesn't provide a global state.
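To illustrate what transforming the actions and their log probabilities can look like, here is a minimal JAX sketch. It assumes a diagonal Gaussian policy squashed with `tanh`, which is a common choice for bounded continuous actions but is not confirmed as the approach taken in this PR. The function names `gaussian_log_prob` and `sample_squashed_action` are hypothetical and are not part of Mava.

```python
import jax
import jax.numpy as jnp


def gaussian_log_prob(x, mean, log_std):
    """Log density of a diagonal Gaussian, summed over the action dimension."""
    std = jnp.exp(log_std)
    per_dim = -0.5 * ((x - mean) / std) ** 2 - log_std - 0.5 * jnp.log(2.0 * jnp.pi)
    return jnp.sum(per_dim, axis=-1)


def sample_squashed_action(key, mean, log_std):
    """Sample a tanh-squashed action in (-1, 1) and its log probability.

    Hypothetical helper: assumes the policy network outputs `mean` and
    `log_std` with shape (..., action_dim).
    """
    noise = jax.random.normal(key, mean.shape)
    raw_action = mean + jnp.exp(log_std) * noise
    action = jnp.tanh(raw_action)
    # Change of variables for a = tanh(u):
    # log(1 - tanh(u)^2) = 2 * (log 2 - u - softplus(-2u)), which is
    # numerically safer than computing log(1 - action**2) directly.
    log_det = 2.0 * (jnp.log(2.0) - raw_action - jax.nn.softplus(-2.0 * raw_action))
    log_prob = gaussian_log_prob(raw_action, mean, log_std) - jnp.sum(log_det, axis=-1)
    return action, log_prob


# Example: 3 agents, each with a 2-dimensional action.
key = jax.random.PRNGKey(0)
mean = jnp.zeros((3, 2))
log_std = jnp.zeros((3, 2))
action, log_prob = sample_squashed_action(key, mean, log_std)
```

If the environment's action bounds differ from (-1, 1), the squashed action would additionally be rescaled, and the log probability would need a matching constant correction for that scaling.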

Why?

These changes add support for environments with continuous actions.

How?

WiemKhlifi commented 9 months ago

We're closing this PR because we'll adopt a new way of initialising the action distribution in the continuous case, using non-vmapped networks, which avoids creating separate system files.