feat: stop accessing LogEnvState variables directly

instadeepai / Mava

🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX

Apache License 2.0

709 stars, 83 forks

What?

Stop accessing LogEnvState variables directly. Direct access forces the LogEnvWrapper to always be the outermost wrapper, so that its variables can be read as state.episode_return rather than state.env_state.episode_return; with an unknown number of wrappers, we can't know how deep in the env_state chain to look. Instead, pass the variables we care about through timestep.extras.
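To illustrate the idea, here is a minimal sketch in plain Python rather than Mava's actual code. All names in it (`CounterEnv`, `RecordMetrics`, `PassThroughWrapper`, and the `"episode_metrics"` key) are hypothetical, and it leaves out auto-reset and JAX pytrees. The metrics wrapper writes its values into `timestep.extras`, so callers can read them without knowing how many wrappers sit around it.

```python
from typing import Any, Dict, NamedTuple, Tuple


class TimeStep(NamedTuple):
    reward: float
    done: bool
    extras: Dict[str, Any]


class CounterEnv:
    """Hypothetical toy env: the state is a step count, episodes last 3 steps."""

    def reset(self) -> Tuple[int, TimeStep]:
        return 0, TimeStep(reward=0.0, done=False, extras={})

    def step(self, state: int, action: int) -> Tuple[int, TimeStep]:
        state = state + 1
        return state, TimeStep(reward=1.0, done=state >= 3, extras={})


class MetricsState(NamedTuple):
    env_state: Any
    episode_return: float
    episode_length: int


class RecordMetrics:
    """Tracks return and length in its own state and mirrors them into extras."""

    def __init__(self, env: Any) -> None:
        self._env = env

    def reset(self) -> Tuple[MetricsState, TimeStep]:
        env_state, ts = self._env.reset()
        state = MetricsState(env_state, 0.0, 0)
        return state, self._with_metrics(ts, state)

    def step(self, state: MetricsState, action: int) -> Tuple[MetricsState, TimeStep]:
        env_state, ts = self._env.step(state.env_state, action)
        state = MetricsState(
            env_state,
            state.episode_return + ts.reward,
            state.episode_length + 1,
        )
        return state, self._with_metrics(ts, state)

    @staticmethod
    def _with_metrics(ts: TimeStep, state: MetricsState) -> TimeStep:
        metrics = {
            "episode_return": state.episode_return,
            "episode_length": state.episode_length,
        }
        # Merge into extras so any outer wrapper forwards the values untouched.
        return ts._replace(extras={**ts.extras, "episode_metrics": metrics})


class OuterState(NamedTuple):
    env_state: Any


class PassThroughWrapper:
    """Stand-in for any wrapper placed outside the metrics wrapper.

    It nests the state one level deeper but forwards the timestep as-is.
    """

    def __init__(self, env: Any) -> None:
        self._env = env

    def reset(self) -> Tuple[OuterState, TimeStep]:
        inner_state, ts = self._env.reset()
        return OuterState(inner_state), ts

    def step(self, state: OuterState, action: int) -> Tuple[OuterState, TimeStep]:
        inner_state, ts = self._env.step(state.env_state, action)
        return OuterState(inner_state), ts


# The metrics wrapper is two levels deep, yet the caller never touches env_state.
env = PassThroughWrapper(PassThroughWrapper(RecordMetrics(CounterEnv())))
state, ts = env.reset()
for _ in range(3):
    state, ts = env.step(state, 0)
final_metrics = ts.extras["episode_metrics"]
```

Reading the same values from the state would require `state.env_state.env_state.episode_return` here, and that path changes whenever a wrapper is added or removed. The `extras` lookup stays the same.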

Also renamed some files to make things clearer :smile: and renamed LogEnvState to bring it more in line with the gym wrapper that does similar things.

Why?

If we pass the state info through timestep.extras, the vault wrapper no longer needs to be the outermost wrapper either.

feat: stop accessing LogEnvState variables directly #994