[Open] DylanCope opened this issue 4 years ago
We are currently working on supporting non-scalar rewards in tf-agents to make this easier. Given this, it's possible to build a meta-agent that initializes several agents. The meta-agent can receive all observations and rewards, split them up appropriately, and invoke the sub-agents. We are currently discussing building support for this. Until then, you could consider passing back the agent's rewards as part of the observation.
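The "pass the rewards back as part of the observation" workaround can be sketched as a thin environment wrapper. This is a hypothetical illustration, not TF-Agents API: the class name, the dict layout, and the placeholder dynamics are all assumptions, shown in plain Python/NumPy so the packing pattern is clear.

```python
import numpy as np

class RewardInObservationEnv:
    """Hypothetical wrapper (not part of TF-Agents): folds each agent's
    reward into the observation so a single scalar-reward environment
    can still drive several sub-agents via a meta-agent."""

    def __init__(self, n_agents):
        self.n_agents = n_agents

    def step(self, actions):
        # Placeholder dynamics: each agent gets a random observation
        # vector and its own reward. A real environment would compute
        # these from `actions` and the environment state.
        obs = {f"agent_{i}": np.random.rand(4) for i in range(self.n_agents)}
        rewards = {f"agent_{i}": float(i) for i in range(self.n_agents)}
        # Pack per-agent rewards alongside the observations so each
        # sub-agent can recover its own reward signal downstream.
        return {
            name: {"observation": obs[name], "reward": rewards[name]}
            for name in obs
        }
```

A meta-agent could then split this dict by agent name and hand each sub-agent its own `observation`/`reward` pair.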
What are the ETAs for these features?
Are there other libraries that offer this sort of functionality that can be referenced?
Is there any progress?
For anyone looking to do multi-agent RL with TF-Agents, I designed a way of initialising "intrinsically motivated" agents where agents compute their reward as a function of their observations, rather than extracting the reward directly from the environment. This allows each of the agents to have different reward functions.
I've written a tutorial and the repository can be cloned from here.
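The intrinsic-motivation idea above amounts to each agent owning a reward function of its observation. As a minimal sketch (the goal-distance reward below is an assumed example, not the tutorial's actual reward):

```python
import numpy as np

def intrinsic_reward(observation, goal):
    """Hypothetical intrinsic reward: negative Euclidean distance to a
    per-agent goal. Because each agent supplies its own `goal`, agents
    sharing one environment can optimise different reward functions
    without the environment emitting per-agent rewards."""
    observation = np.asarray(observation, dtype=float)
    goal = np.asarray(goal, dtype=float)
    return -float(np.linalg.norm(observation - goal))
```

Each agent would call a function like this on its own observation inside its training loop, instead of reading the reward from the environment's time step.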
Is there a suggested way to set up a multi-agent RL experiment where the agents have differing reward functions? In RLlib there is the `MultiAgentEnv`, but I do not see a natural way to use `PyEnvironment` with multiple agents.