google-deepmind / meltingpot

A suite of test scenarios for multi-agent reinforcement learning.
Apache License 2.0
582 stars, 118 forks

Question about Episode Length Settings in Evaluation Scenarios #254

Closed: az2181036 closed this issue 2 months ago

az2181036 commented 2 months ago

Hello,

Firstly, I’d like to express my appreciation for your fantastic work on this gym. It’s been incredibly helpful and insightful.

I have a question regarding the evaluation scenarios: Given that the substrate can have a random length for an episode, are the episode lengths in the evaluation scenario also randomized? Additionally, how do you ensure that the evaluation of different algorithms is conducted under fair and consistent conditions concerning episode length?

Thank you for your assistance.

duenez commented 2 months ago

Yes, the episode lengths are also randomised. We just evaluate for many episodes and take the average. It's important that the evaluation is not on a finite horizon, because with a finite horizon backwards induction becomes possible, which changes the dynamics.
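
The "many episodes, then average" approach can be illustrated with a toy simulation. This is only a sketch, not Melting Pot's evaluation code: the length rule (a minimum number of steps, then a fixed per-step stopping probability), the Gaussian per-step reward, and the names `sample_episode_length`, `run_episode` and `evaluate` are all assumptions made for illustration.

```python
import random
import statistics


def sample_episode_length(rng, min_steps=100, stop_prob=0.01):
    """Hypothetical length rule: run min_steps, then stop each step with stop_prob."""
    steps = min_steps
    while rng.random() >= stop_prob:
        steps += 1
    return steps


def run_episode(rng, reward_per_step):
    """Toy stand-in for one evaluation episode; returns (episode_return, length)."""
    length = sample_episode_length(rng)
    # Assumed reward model: noisy per-step reward centred on reward_per_step.
    episode_return = sum(rng.gauss(reward_per_step, 1.0) for _ in range(length))
    return episode_return, length


def evaluate(reward_per_step, num_episodes=500, seed=0):
    """Average episode return over many random-length episodes, with its standard error."""
    rng = random.Random(seed)
    returns = [run_episode(rng, reward_per_step)[0] for _ in range(num_episodes)]
    mean = statistics.fmean(returns)
    stderr = statistics.stdev(returns) / num_episodes ** 0.5
    return mean, stderr


if __name__ == "__main__":
    for name, reward in [("policy_a", 1.0), ("policy_b", 2.0)]:
        mean, stderr = evaluate(reward)
        print(f"{name}: mean return {mean:.1f} +/- {stderr:.1f}")
```

Any single episode may be shorter or longer than another, but in this toy the standard error of the mean shrinks as the number of episodes grows, so the averages can be compared across algorithms even though no individual episode has a fixed length.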