nsidn98 / InforMARL

Code for our paper: Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation
https://nsidn98.github.io/InforMARL/
MIT License
91 stars 22 forks

Node observation representation #9

Closed rayidali closed 7 months ago

rayidali commented 8 months ago

Hello, as I understand it, the purpose of this project is to show the effectiveness of the GNN when agents only have information from their local field of vision (plus the messages). However, I noticed in the navigation_graph.py file that node_obs and the adj matrix are formed by calculating the distance, or relative distance, between all entities and the ego agent. This seems a little contradictory, since agents wouldn't know the relative positions of any entities other than their own goal positions, so I wanted to ask for some clarification.

So, do you only use this during training or do you follow a similar process for evaluation of the learned policy after training?

Also, I am trying to understand what the numbers in node_obs represent within the agent's field of vision, and how the adjacency matrix is used as part of the input to the neural network when the agent hasn't 'seen' the entities it is calculating relative distances for.

https://github.com/nsidn98/InforMARL/blob/304e905d05b34d9bf06046eb7e03904b97a14231/multiagent/custom_scenarios/navigation_graph.py#L473

nsidn98 commented 8 months ago

Hi @rayidali, that is a great question!

The node_obs and adjacency matrix are calculated globally in the environment, as you pointed out in the issue above. But when we do the actual processing with the GNNs, we filter the adjacency matrix based on how far away the entities are from the ego agent. You can refer to this line: https://github.com/nsidn98/InforMARL/blob/304e905d05b34d9bf06046eb7e03904b97a14231/onpolicy/algorithms/utils/gnn.py#L324

The agents are assumed to sense the relative positions of neighbouring entities within a sensing radius around them. We fill node_obs with global information only to reduce computational cost; during both training and testing we do filter out the far-away nodes.
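To make the filtering step described above concrete, here is a minimal sketch in plain Python. It is not the code from gnn.py: the function names, the `sensing_radius` parameter, and the strict `0 < d < sensing_radius` comparison are all assumptions chosen for illustration. The idea is that the environment can build a global distance matrix once, and the edges fed to the GNN are then restricted to pairs of entities that are within sensing range of each other.

```python
import math


def build_global_adjacency(positions):
    """Pairwise Euclidean distances between all entities (global information)."""
    n = len(positions)
    return [[math.dist(positions[i], positions[j]) for j in range(n)]
            for i in range(n)]


def filter_by_sensing_radius(adj, sensing_radius):
    """Keep a distance only if 0 < d < sensing_radius; otherwise set it to 0.

    Assumption for this sketch: a zero entry means "no edge", so self-loops
    (distance 0) and far-away entities are both dropped.
    """
    return [[d if 0.0 < d < sensing_radius else 0.0 for d in row]
            for row in adj]


def edges_from_adjacency(adj):
    """List the (source, target) index pairs that still have a non-zero entry."""
    n = len(adj)
    return [(i, j) for i in range(n) for j in range(n) if adj[i][j] > 0.0]


# Three entities on a line: 0 and 1 are close together, 2 is far from both.
positions = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0)]
adj = build_global_adjacency(positions)
local_adj = filter_by_sensing_radius(adj, sensing_radius=1.0)
edges = edges_from_adjacency(local_adj)
print(edges)  # [(0, 1), (1, 0)]
```

In this sketch, entity 2 still appears as a row in the global matrix, but it has no edges after filtering, so no information about it would reach entities 0 or 1 through message passing.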

I hope this answers your question!

nsidn98 commented 8 months ago

Hello @rayidali, I was wondering if you still had any further questions regarding this or if I could close this issue?

nsidn98 commented 7 months ago

Closing this due to inactivity. Please reopen if the issue still persists. Thanks!