proroklab / VectorizedMultiAgentSimulator

VMAS is a vectorized differentiable simulator designed for efficient Multi-Agent Reinforcement Learning benchmarking. It comprises a vectorized 2D physics engine written in PyTorch and a set of challenging multi-robot scenarios. Additional scenarios can be implemented through a simple and modular interface.
https://vmas.readthedocs.io
GNU General Public License v3.0

MPE - Simple Reference Reward Question #149

Open fourpenny opened 5 hours ago

fourpenny commented 5 hours ago

The reward function for agents in the VMAS version of the Simple Reference scenario from the MPE differs from the current implementation in PettingZoo.

PettingZoo either penalizes each agent individually for its distance from its corresponding landmark or returns the average of these penalties across all agents.

In VMAS, the reward is calculated based only on the distance of the first agent in the environment from the landmarks.

Is this difference intentional? It seems like this implementation would make it difficult for both agents to learn to approach their corresponding landmarks.
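For concreteness, the PettingZoo behavior described above looks roughly like this (an illustrative sketch with hypothetical names, not the actual PettingZoo source):

```python
import numpy as np

# Illustrative sketch (not the actual PettingZoo source) of the two reward
# modes described above: per-agent distance penalties, optionally averaged
# into a single shared reward. Function and argument names are hypothetical.
def pettingzoo_style_rewards(agent_positions, goal_positions, shared=True):
    penalties = [
        -float(np.linalg.norm(pos - goal))
        for pos, goal in zip(agent_positions, goal_positions)
    ]
    if shared:
        # Global-reward mode: every agent receives the average penalty.
        mean_penalty = sum(penalties) / len(penalties)
        return [mean_penalty] * len(penalties)
    # Local-reward mode: each agent receives its own penalty.
    return penalties
```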

matteobettini commented 4 hours ago

Hello,

Have you read this? https://github.com/proroklab/VectorizedMultiAgentSimulator/issues/62#issuecomment-1781214094

It should have the answer.

VMAS follows the logic of the original MPE repository, while PettingZoo changed a few things.

Also, I think you misunderstood the VMAS code: we are not using the position of the first agent only; we are just computing the reward on the first call of the function.
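To make that concrete, the idiom in VMAS scenarios looks roughly like the sketch below (a minimal illustration of the pattern, not the exact simple_reference source; the per-agent `goal` attribute is assumed to be set in `make_world`):

```python
import torch

# Minimal sketch of the "compute on the first call" idiom used by VMAS
# scenarios (illustrative, not the exact simple_reference source).
# The check against the first agent decides WHEN the shared reward is
# computed, not WHOSE position it is computed from.
def reward(self, agent):
    is_first = agent == self.world.agents[0]
    if is_first:
        # Computed once per step: accumulate the (vectorized) penalty over
        # ALL agents' distances to their goal landmarks. `a.goal` is assumed
        # to have been assigned in make_world for this sketch.
        self.rew = torch.zeros(self.world.batch_dim, device=self.world.device)
        for a in self.world.agents:
            self.rew -= torch.linalg.vector_norm(
                a.state.pos - a.goal.state.pos, dim=-1
            )
    # Every agent, first or not, receives the same cached shared reward.
    return self.rew
```

So the first agent only acts as a trigger for the once-per-step computation; every agent's position still contributes to the shared reward.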