Closed fschlatt closed 5 years ago
My only objection is that this runs into the partial information problem pretty quickly. I think if an agent sees another one die, then they should be able to tell their teammate (without using bits). If, however, neither agent on a team sees an enemy agent die, then it would be strange for them to know about this unless they together cover the entire view space.
I don't mind doing this, but please respect the above in your PR.
AFAIK, information about which agents are alive is passed to all agents anyway, regardless of whether they were able to see the agent die. But my request is more a matter of convenience when generating a replay buffer for training, and of making the output of the step function more uniform and logical.
Hmm, it gets weird for the team setting, though. Is an agent done when it is dead, or only when its teammate has also died? I guess the current implementation does make the most sense, and you have to come up with your own definition of when an episode ends for an agent.
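To make the two candidate definitions concrete, here is a minimal sketch. It is not the environment's actual code: the function names, the zero-based agent ids, and the `teams` layout are all assumptions made for illustration.

```python
from typing import List, Sequence


def done_when_agent_dead(alive: Sequence[int], num_agents: int) -> List[bool]:
    """Definition A: an agent is done as soon as it is dead itself."""
    alive_set = set(alive)
    return [agent_id not in alive_set for agent_id in range(num_agents)]


def done_when_team_dead(alive: Sequence[int],
                        teams: Sequence[Sequence[int]]) -> List[bool]:
    """Definition B: an agent is done only once its whole team is dead."""
    alive_set = set(alive)
    done = {}
    for team in teams:
        # The team is finished only if none of its members is still alive.
        team_dead = not any(agent_id in alive_set for agent_id in team)
        for agent_id in team:
            done[agent_id] = team_dead
    # Return the flags ordered by agent id.
    return [done[agent_id] for agent_id in sorted(done)]


# Hypothetical 2v2 setup: agents 0 and 2 versus agents 1 and 3.
teams = [[0, 2], [1, 3]]
alive = [0, 3]  # agents 1 and 2 have died
print(done_when_agent_dead(alive, num_agents=4))
print(done_when_team_dead(alive, teams))
```

With one survivor on each team, definition A marks the two dead agents as done, while definition B marks nobody as done, which is exactly the ambiguity in question.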
I'd propose to align the done output from `env.step` with the state and reward outputs and return a list of boolean values, one for every agent. If an agent is no longer alive, done is set to `True`. Currently, if you would like to know whether an agent is still alive, you need to check its state and the list of alive agents, which is not really an elegant way to get such vital information.

I'll create a PR if no one has any objections.
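A rough sketch of what the proposed output could look like when filling a replay buffer. It assumes the list of alive agents holds zero-based agent ids and that states and rewards are already per-agent lists. The helper names `per_agent_done` and `to_transitions` are hypothetical and not part of the environment.

```python
from typing import Any, List, Sequence, Tuple


def per_agent_done(alive: Sequence[int], num_agents: int) -> List[bool]:
    """Return one done flag per agent: True if the agent is no longer alive."""
    alive_set = set(alive)
    return [agent_id not in alive_set for agent_id in range(num_agents)]


def to_transitions(states: Sequence[Any],
                   rewards: Sequence[float],
                   dones: Sequence[bool]) -> List[Tuple[Any, float, bool]]:
    """Zip the aligned per-agent outputs into one (state, reward, done) tuple per agent."""
    return list(zip(states, rewards, dones))


# Hypothetical step output for four agents, where agent 1 has died.
states = ["s0", "s1", "s2", "s3"]
rewards = [0.0, -1.0, 0.0, 0.0]
dones = per_agent_done(alive=[0, 2, 3], num_agents=4)

for agent_id, transition in enumerate(to_transitions(states, rewards, dones)):
    print(agent_id, transition)
```

Because all three outputs share the same length and ordering, each agent's transition can be stored without a separate lookup into the alive list.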