instadeepai / Mava

🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX
Apache License 2.0

fix: final return value for SAC systems #1066

Closed by sash-a 8 months ago

sash-a commented 8 months ago

What?

Fix the final return value of the SAC systems to bring them in line with the PPO systems for Hydra sweeps.
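
For context, Hydra sweeper plugins such as Optuna use the value returned by the task function as the objective to optimise, so every system's entry point needs to return one comparable scalar. Below is a minimal sketch of that pattern, not the actual Mava code: `run_experiment`, `main`, and the `episode_returns` config key are hypothetical names, and the `@hydra.main` decorator is only mentioned in a comment so the snippet runs without Hydra installed.

```python
from statistics import mean
from typing import Dict, List


def run_experiment(config: Dict[str, List[float]]) -> float:
    """Hypothetical stand-in for a system's train/eval loop.

    A real system would train and evaluate here; this sketch just reads
    pre-computed evaluation returns from the config.
    """
    episode_returns = config["episode_returns"]
    # Reduce to a single Python float so a sweeper gets one scalar objective.
    return float(mean(episode_returns))


def main(config: Dict[str, List[float]]) -> float:
    # In a real entry point this function would be wrapped with @hydra.main(...).
    # Returning the scalar (rather than None) is what lets a sweeper use it.
    final_return = run_experiment(config)
    return final_return


if __name__ == "__main__":
    print(main({"episode_returns": [1.0, 2.0, 3.0]}))
```

If one system returns `None` or a differently shaped value while another returns a scalar, the same sweep configuration cannot be reused across them, which is the mismatch this PR addresses.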