What?
Refactor the evaluator to reduce code duplication and reduce memory usage.
Why?
The evaluator was one of the oldest functions in Mava and was due for a rewrite, as issues #996 and #1001 show. It had become complicated to extend, and setting the number of eval episodes too high made it try to create too many parallel environments.
How?
Have a single evaluator function that works for both recurrent and feed-forward systems (and hopefully any future systems) by taking in an acting function that conforms to the API defined in evaluator.py.
Only create num_envs parallel environments and loop for as many iterations as needed to run at least num_eval_episodes evaluation rollouts.
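As a rough illustration of the two points above, here is a minimal pure-Python sketch. It is not the code in this PR: the act function signature (params, observations, actor_state) -> (actions, new_actor_state), the toy_rollout stand-in, and all function names are assumptions made for the example; the real API is the one defined in evaluator.py.

```python
import math
from typing import Any, Callable, List, Tuple

# Hypothetical act function shape (an assumption, not the API in evaluator.py):
# (params, observations, actor_state) -> (actions, new_actor_state)
ActFn = Callable[[Any, List[float], Any], Tuple[List[float], Any]]


def num_eval_iterations(num_eval_episodes: int, num_envs: int) -> int:
    """Smallest number of batched rollouts that covers at least num_eval_episodes."""
    return math.ceil(num_eval_episodes / num_envs)


def toy_rollout(act_fn: ActFn, params: Any, actor_state: Any,
                num_envs: int, episode_length: int = 3) -> List[float]:
    """Stand-in for one batched rollout: one episode return per environment."""
    episode_returns = [0.0] * num_envs
    observations = [0.0] * num_envs
    for _ in range(episode_length):
        actions, actor_state = act_fn(params, observations, actor_state)
        # Toy dynamics: the reward is the action, and so is the next observation.
        episode_returns = [r + a for r, a in zip(episode_returns, actions)]
        observations = actions
    return episode_returns


def evaluate(act_fn: ActFn, params: Any, init_actor_state: Any,
             num_eval_episodes: int, num_envs: int) -> List[float]:
    """Run only num_envs environments at a time and loop until enough episodes are done."""
    returns: List[float] = []
    for _ in range(num_eval_iterations(num_eval_episodes, num_envs)):
        returns.extend(toy_rollout(act_fn, params, init_actor_state, num_envs))
    return returns


def feed_forward_act(params, observations, actor_state):
    # Feed-forward: the actor state is passed through untouched.
    return [params for _ in observations], actor_state


def recurrent_act(params, observations, actor_state):
    # Recurrent: the actor state (a step counter here) is updated on every call.
    new_state = actor_state + 1
    return [params * new_state for _ in observations], new_state


ff_returns = evaluate(feed_forward_act, 1.0, None, num_eval_episodes=10, num_envs=4)
rec_returns = evaluate(recurrent_act, 1.0, 0, num_eval_episodes=10, num_envs=4)
```

With num_eval_episodes=10 and num_envs=4 the loop runs three times, so 12 episodes are evaluated: slightly more than requested, but never more than num_envs environments exist at once. The same evaluate function serves both the feed-forward and the recurrent act function because the state handling lives entirely inside the act function.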
Closes: #996 and #1001
This also achieves the goal of #1071, so we can close it if this is merged.