Hello,
Thanks for the interest!
CPPO is the default RLlib PPO, which treats all agents as one. IPPO and MAPPO are done using our custom trainer, available at https://github.com/proroklab/rllib_differentiable_comms. See https://github.com/proroklab/HetGPPO for an example of how to use the trainer with VMAS.
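For reference, a minimal sketch of the CPPO-style setup (stock RLlib PPO on a VMAS scenario) might look like the following. The registered env name, the `"transport"` scenario choice, and the exact `vmas.make_env`/wrapper arguments are assumptions here, not the repo's documented setup; check the VMAS and rllib_differentiable_comms examples for the real entry points.

```python
import ray
from ray import tune
from ray.rllib.algorithms.ppo import PPOConfig
import vmas


def make_vmas_env(env_config):
    # Assumption: vmas.make_env can hand back an RLlib-compatible wrapped env.
    return vmas.make_env(
        scenario=env_config.get("scenario", "transport"),
        num_envs=env_config.get("num_envs", 32),
        device="cpu",
        wrapper="rllib",  # assumed wrapper name, see the VMAS examples
    )


ray.init()
tune.register_env("vmas_scenario", make_vmas_env)

config = (
    PPOConfig()
    .environment(env="vmas_scenario", env_config={"scenario": "transport"})
    .framework("torch")
)
algo = config.build()
print(algo.train())
```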
In each scenario, there is an observation function that returns the observations for each agent.
It has the form:
```python
def observation(self, agent):
    # concatenate this agent's state tensors along the feature dimension
    return torch.cat([agent.state.pos, agent.state.vel, ...], dim=-1)
```
If you want pixels, you can return pixels there, or anything else you want. If you want a global observation for each agent, you can return the same global observation for every agent, as sketched below.
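As an illustration, a sketch of such a global observation (assuming the scenario keeps its world in `self.world`) could concatenate the states of all agents, so every agent receives the same vector:

```python
import torch


def observation(self, agent):
    # every agent gets the same global observation:
    # positions and velocities of all agents in the world
    return torch.cat(
        [a.state.pos for a in self.world.agents]
        + [a.state.vel for a in self.world.agents],
        dim=-1,
    )
```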
Communication of observations is not part of the paper. But if you are curious, we do comms via a GNN: the observation function returns local observations, such as position, and then in the neural network we build the agent graph, which determines which neighbours' information each agent can aggregate over.
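To make the idea concrete, here is a toy sketch of that aggregation step, not the actual trainer code: the distance threshold and the plain mean aggregation are assumptions, whereas the real implementation is a GNN with learned layers (see rllib_differentiable_comms).

```python
import torch


def aggregate_neighbour_features(positions, features, comm_radius=1.0):
    """Toy GNN-style aggregation: each agent averages the features of
    neighbours within comm_radius (assumed threshold)."""
    # positions: (n_agents, 2), features: (n_agents, feat_dim)
    dists = torch.cdist(positions, positions)      # pairwise distances
    adjacency = (dists < comm_radius).float()      # agent graph (with self-loops)
    adjacency = adjacency / adjacency.sum(dim=1, keepdim=True)  # row-normalise
    return adjacency @ features                    # mean over neighbour features
```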
Hi all,
I was wondering whether the PPO-based MARL algorithms you use in the paper are taken from RLlib, or whether they are already available in the library without the need for an RLlib interface.
I also have a question regarding the inputs of the NN. Do you use CNNs, as in the Atari games? In the paper you mention known information from the neighborhood; how are the shape and size of the neighborhood customizable, and is the known information relative to the agent or to a global frame? I imagine it is customizable, but I would like to know how it is currently implemented so I can better understand what I am seeing.
Thanks and kind regards,
menichel