Closed by inconst 5 years ago
Hi @inconst
I am not sure I fully understand your question. In our definitions, an environment can contain multiple agents. When env.reset is called, the environment (and all of the agents it contains) will reset. When the Academy of an environment is Done, the Python env will no longer be able to step, and env.reset must be called to continue the simulation.
In the case of a multi-agent setup, there is still a single environment.
Hi @vincentpierre
So whenever any of the agents finishes its episode, Done is called, right? But other agents may need more time to finish theirs, so they can't be reset yet, because env.reset() will reset all agents. Is this handled somehow?
There are two levels of Done flags: one on the Agents and one on the Environment / Academy. When the Environment / Academy is Done, everything must reset: Academy and Agents alike. When an Agent is done, only that Agent is labeled as Done and the simulation can go on. (The Environment is not Done and can proceed.)
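To make the two levels concrete, here is a toy sketch in plain Python. It is not the ML-Agents API: ToyMultiAgentEnv, episode_lengths, max_env_steps, global_done and the list of per-agent flags returned by step() are all hypothetical names made up for this illustration, and the fixed episode lengths are an assumption chosen to keep the example deterministic.

```python
class ToyMultiAgentEnv:
    """Hypothetical stand-in for one environment holding several agents."""

    def __init__(self, episode_lengths, max_env_steps):
        self.episode_lengths = episode_lengths  # steps per episode, per agent
        self.max_env_steps = max_env_steps      # steps until the env is Done
        self.reset()

    def reset(self):
        # Environment-level reset: the environment and every agent start over.
        self.env_steps = 0
        self.global_done = False
        self.agent_steps = [0] * len(self.episode_lengths)

    def step(self):
        # Once the environment is Done, stepping is refused until reset().
        if self.global_done:
            raise RuntimeError("environment is Done; call reset() first")
        self.env_steps += 1
        local_done = []
        for i, length in enumerate(self.episode_lengths):
            self.agent_steps[i] += 1
            done = self.agent_steps[i] >= length
            local_done.append(done)
            if done:
                # Agent-level Done: only this agent restarts its episode.
                self.agent_steps[i] = 0
        self.global_done = self.env_steps >= self.max_env_steps
        return local_done


def run(env):
    """Step until the environment is Done, counting finished episodes per agent."""
    episodes = [0] * len(env.episode_lengths)
    while not env.global_done:
        for i, done in enumerate(env.step()):
            if done:
                episodes[i] += 1
    return episodes


env = ToyMultiAgentEnv(episode_lengths=[2, 3], max_env_steps=6)
episodes = run(env)
print(episodes)         # finished episodes per agent
print(env.global_done)  # True: env.reset() is now required
```

In this sketch the first agent finishes three episodes and the second finishes two before the environment itself becomes Done. Neither agent has to wait for the other, and reset() is only needed at the end.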
Documentation for env.step() says that
But how can I reset separate environments in the case of a multi-agent setup?