marlbenchmark / on-policy

This is the official implementation of Multi-Agent PPO (MAPPO).
https://sites.google.com/view/mappo
MIT License

Question about the episode length of 1000 in the gfootball env despite the env's maximum limit of 400 steps #83

Open DeeDive opened 1 year ago

DeeDive commented 1 year ago

Dear authors,

Thank you for this work! Could you please clarify something that confuses me? I notice that the gfootball env terminates after at most 400 steps, as stated in its paper, yet the gfootball training scripts set an episode length of 1000. Could you explain the motivation for that? (See, e.g., the football script https://github.com/marlbenchmark/on-policy/blob/b21e0f743bd4516086825318452bb6927a33538d/onpolicy/scripts/train_football_scripts/train_football_ca_hard.sh#L14C16-L14C20)

Best!

DeeDive commented 1 year ago

I know that the vec env automatically resets the environment when it encounters the done=True flag (see the sketch after these questions), but I would appreciate it if you could address two questions:

  1. How do we typically choose this episode length value?
  2. Why is it set here to more than twice the maximum allowed episode length?
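For reference, here is a minimal toy sketch (my own code, not from this repo; all class and variable names are hypothetical) of how I understand the rollout collection to work: the buffer gathers `episode_length` steps per update, and the vec env simply resets whenever done=True, so a single rollout can span several environment episodes even when `episode_length` exceeds the env's own step limit.

```python
import numpy as np


class ToyEnv:
    """Stand-in environment that terminates after at most `max_steps` steps."""

    def __init__(self, max_steps=400):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return np.zeros(4)  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.max_steps
        return np.zeros(4), 0.0, done, {}


class AutoResetVecEnv:
    """Mimics the auto-reset behaviour: a done env is reset immediately."""

    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        return np.stack([env.reset() for env in self.envs])

    def step(self, actions):
        obs, rews, dones = [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d, _ = env.step(action)
            if d:
                o = env.reset()  # the rollout continues into a fresh episode
            obs.append(o)
            rews.append(r)
            dones.append(d)
        return np.stack(obs), np.array(rews), np.array(dones)


if __name__ == "__main__":
    episode_length = 1000  # rollout length per update (the value from the script)
    vec_env = AutoResetVecEnv([ToyEnv(max_steps=400) for _ in range(2)])

    obs = vec_env.reset()
    episodes_finished = 0
    for step in range(episode_length):
        actions = [None, None]  # dummy actions for the toy env
        obs, rews, dones = vec_env.step(actions)
        episodes_finished += dones.sum()

    # With episode_length=1000 and a 400-step env, each rollout spans multiple
    # full episodes (prints 4 here: 2 envs x 2 completed episodes each, plus a
    # partial third episode per env that carries into the next rollout).
    print(f"episodes finished inside one rollout: {episodes_finished}")
```

If that understanding is correct, my remaining confusion is only about why 1000 in particular was chosen rather than a multiple of the 400-step limit.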