fix: final return value for SAC systems

instadeepai / Mava

🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX

Apache License 2.0

737 stars 90 forks

Closed (sash-a closed 8 months ago)

sash-a commented 8 months ago

Fix the final return value of the SAC systems to bring them in line with PPO for Hydra sweeps.
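
For context, Hydra sweeper plugins use the value returned by the task function (the one decorated with `@hydra.main`) as the objective for each trial, so every system's entry point needs to return a comparable scalar. The snippet below is only a rough sketch of that pattern, not Mava's actual code: `run_experiment`, the config keys, and the choice of "last evaluation return" as the metric are all assumptions made for illustration.

```python
from typing import Dict, List


def run_experiment(config: Dict[str, float]) -> float:
    """Hypothetical stand-in for a system's training entry point."""
    eval_returns: List[float] = []
    for step in range(int(config["num_evaluations"])):
        # Placeholder for a real evaluation: the "return" simply grows with the step.
        eval_returns.append(config["lr"] * (step + 1))
    # Return one Python float so a sweeper can treat it as the trial's objective.
    return float(eval_returns[-1])


# Under Hydra, this call would sit inside the function decorated with @hydra.main,
# and that function would return the same scalar.
final_return = run_experiment({"lr": 0.5, "num_evaluations": 4})
```

If one system returned a different kind of value than another (for example, nothing at all, or a different metric), sweep results across the two would not be comparable, which is presumably the inconsistency this PR removes.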