facebookresearch / BenchMARL

A collection of MARL benchmarks based on TorchRL
https://benchmarl.readthedocs.io/
MIT License

[BugFix] More flexible episode_reward computation in logger #136

Closed matteobettini closed 1 month ago

matteobettini commented 1 month ago

This PR fixes the way episode rewards are computed in BenchMARL.

Here is an overview:

episode_reward computation

BenchMARL looks at the global done (always assumed to be set), which can usually be computed by applying any or all over the individual agents' dones.

In all cases the global done is what is used to compute the episode reward.
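As a rough illustration of the idea above, here is a minimal pure-Python sketch, not BenchMARL's actual logger code. The helper names `global_done` and `episode_rewards` are hypothetical, and the data layout (a flat list of per-step rewards and a list of per-agent done flags per step) is an assumption made for the example.

```python
from typing import Callable, Iterable, List


def global_done(agent_dones: Iterable[bool], reduce: Callable = any) -> bool:
    # Hypothetical helper: derive the global done from per-agent dones
    # using either `any` or `all`.
    return bool(reduce(agent_dones))


def episode_rewards(
    rewards: List[float],
    agent_dones: List[List[bool]],
    reduce: Callable = any,
) -> List[float]:
    # Accumulate the per-step reward and close an episode whenever the
    # global done is set. Steps after the last global done belong to an
    # unfinished episode and are not reported.
    totals = []
    running = 0.0
    for reward, dones in zip(rewards, agent_dones):
        running += reward
        if global_done(dones, reduce):
            totals.append(running)
            running = 0.0
    return totals


# Four steps, two agents (assumed toy data).
rewards = [1.0, 2.0, 3.0, 4.0]
dones = [[False, False], [True, False], [False, False], [True, True]]

with_any = episode_rewards(rewards, dones, reduce=any)  # episodes end at steps 2 and 4
with_all = episode_rewards(rewards, dones, reduce=all)  # episode ends at step 4 only
```

With `any`, the rollout splits into two episodes as soon as a single agent is done; with `all`, it is treated as one episode that ends only when every agent is done.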

We log episode_reward min, mean, max over episodes at three different levels:

Requirement

While an agent is done but the global done is not yet set, that agent should receive a reward of 0 (unless you are using global rewards).
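To see why this matters, the sketch below zeroes out one agent's rewards after its own done flag has been set, so that summing up to the global done does not count rewards from steps where the agent was already finished. The helper `zero_rewards_after_done` is hypothetical and stands in for whatever the environment does on its side; it is not part of BenchMARL.

```python
from typing import List


def zero_rewards_after_done(rewards: List[float], dones: List[bool]) -> List[float]:
    # Hypothetical environment-side helper: keep rewards up to and
    # including the step where the agent becomes done, then emit 0.
    masked = []
    already_done = False
    for reward, done in zip(rewards, dones):
        masked.append(0.0 if already_done else reward)
        already_done = already_done or done
    return masked


# Agent 0 finishes at step 2; the global done (all agents) is set at step 3.
agent0_rewards = [1.0, 1.0, 5.0]  # the 5.0 arrives after the agent is done
agent0_dones = [False, True, True]

masked = zero_rewards_after_done(agent0_rewards, agent0_dones)
episode_reward_agent0 = sum(masked)          # only rewards earned while active
unmasked_total = sum(agent0_rewards)         # inflated by the post-done reward
```

Without the masking, the agent's episode reward would include the reward received after it was done, which is the situation the requirement rules out.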

Fixes #135