instadeepai / Mava

🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX
Apache License 2.0

Feat/sebulba ippo #959

Closed OmaymaMahjoub closed 10 months ago

OmaymaMahjoub commented 11 months ago

What?

Implement the Sebulba architecture with feedforward IPPO on Rware.

Why?

Integrate the Sebulba architecture because it works well with environments that cannot be jitted or are not written in JAX.

How?

Extend the existing Cleanba code to support MARL algorithms and ensure compatibility with Mava's key components, including the logger and the evaluator.
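As a rough illustration of the actor/learner split that a Sebulba-style setup relies on, here is a minimal sketch using only the Python standard library. It is not the Cleanba or Mava code: `ToyEnv`, `actor`, `learner`, `run`, and the `bias` "parameter" are hypothetical names made up for this example, and the "update" is a placeholder rather than a PPO step. The point is only that the environment is stepped in a plain Python thread (so it does not need to be jittable), while rollouts and parameters are exchanged through queues.

```python
import queue
import threading


class ToyEnv:
    """Hypothetical stand-in for a non-JAX, non-jittable multi-agent env."""

    def __init__(self, num_agents=2):
        self.num_agents = num_agents
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0] * self.num_agents

    def step(self, actions):
        self.t += 1
        obs = [float(self.t)] * self.num_agents
        rewards = [float(a) for a in actions]
        return obs, rewards


def actor(env, params_q, rollout_q, rollout_len, num_updates):
    """Steps the env in plain Python and ships rollouts to the learner."""
    obs = env.reset()
    for _ in range(num_updates):
        params = params_q.get()  # block until the learner publishes params
        traj = []
        for _ in range(rollout_len):
            # Placeholder policy: every agent plays the current "bias".
            actions = [params["bias"]] * env.num_agents
            next_obs, rewards = env.step(actions)
            traj.append((obs, actions, rewards))
            obs = next_obs
        rollout_q.put(traj)


def learner(params_q, rollout_q, num_updates):
    """Publishes params, consumes one rollout, applies a dummy update."""
    params = {"bias": 0}
    rollout_lengths = []
    for _ in range(num_updates):
        params_q.put(params)
        traj = rollout_q.get()
        rollout_lengths.append(len(traj))
        # Placeholder for a real (e.g. jitted PPO) update on the rollout.
        params = {"bias": params["bias"] + 1}
    return params, rollout_lengths


def run(num_updates=3, rollout_len=4):
    params_q = queue.Queue(maxsize=1)
    rollout_q = queue.Queue(maxsize=1)
    actor_thread = threading.Thread(
        target=actor,
        args=(ToyEnv(), params_q, rollout_q, rollout_len, num_updates),
    )
    actor_thread.start()
    final_params, rollout_lengths = learner(params_q, rollout_q, num_updates)
    actor_thread.join()
    return final_params, rollout_lengths
```

Calling `run()` starts one actor thread and runs the learner loop in the main thread. In a real setup the learner update would be the jitted part and there would typically be several actor threads, but those details are beyond this sketch.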

Extra

Action Item:

OmaymaMahjoub commented 10 months ago

Many changes have landed on develop since this PR was last updated, and merging now would cause a massive merge conflict. We therefore need to create a new branch and a new PR for the Sebulba IPPO!