Farama-Foundation / HighwayEnv

A minimalist environment for decision-making in autonomous driving
https://highway-env.farama.org/
MIT License

Graph Reinforcement Learning implementation #561

Closed: redfish9 closed this issue 8 months ago

redfish9 commented 8 months ago

Thanks @eleurent for your awesome work!

I'm currently working with this framework, and I'm wondering whether highway_env is capable of handling multi-agent Graph Reinforcement Learning. In this case, the observation space is the product of the observation matrix O, adjacency matrix A, and mask matrix M.

Moreover, I'm also curious about the implementation of different behavior types of non-agent vehicles. What's your recommended approach for this? I'm considering creating separate classes at /highway_env/vehicle/behavior.py. I would greatly appreciate any tips or advice you could provide.

redfish9 commented 8 months ago

In terms of observation, does the "nearby vehicles" logic in KinematicObservation imply an operation similar to an adjacency matrix?

eleurent commented 8 months ago

> is highway_env capable of handling multi-agent Graph Reinforcement Learning? In this case, the observation space is the product of the observation matrix O, adjacency matrix A, and mask matrix M

It can probably be made compatible, but as of now there is no support for this specific format.

> Moreover, I'm also curious about the implementation of different behavior types of non-agent vehicles. What's your recommended approach for this? I'm considering creating separate classes at /highway_env/vehicle/behavior.py

Yes, that sounds right.
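As a rough illustration of how such classes are usually wired in, the config fragment below selects the class used for non-agent traffic by its dotted path. It assumes the `other_vehicles_type` config key and the `AggressiveVehicle` class shipped in `highway_env/vehicle/behavior.py`; check both against your installed version. The custom class name in the comment is hypothetical.

```python
# Config fragment (a sketch, assuming the "other_vehicles_type" key):
# non-agent vehicles are instantiated from the class at this dotted path.
traffic_config = {
    "other_vehicles_type": "highway_env.vehicle.behavior.AggressiveVehicle",
}

# A custom behavior class added to behavior.py would be referenced the same
# way, e.g. "highway_env.vehicle.behavior.MyCautiousVehicle" (hypothetical name).
```

The dict would then be merged into the environment configuration, for example through the environment's `configure(...)` method followed by a reset, depending on the version in use.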

> In terms of observation, does the "nearby vehicles" in KinematicObservation imply a similar operation like adjacency matrix?

I'm not super familiar with the exact multi-agent graph model you are referring to, but yes, it's probably similar: given an observer vehicle, we iterate through all vehicles and compute some observations (e.g. relative ones like distance or angle, or absolute speed/position). Right now the focus is mostly on single-agent scenarios, so you get an array of shape [vehicles, features], and in the simple multi-agent extension which is currently implemented you get a tuple of such arrays. If every vehicle is an agent, this can be turned into an adjacency matrix/tensor.
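To make that last step concrete, here is a minimal, dependency-free sketch of turning a tuple of per-agent [vehicles, features] arrays into an adjacency matrix A and a mask M. It is not part of highway_env. The function name, the distance-threshold rule, and the `radius` parameter are illustrative assumptions, as is the feature layout `[presence, x, y, vx, vy]` with row 0 of each array describing the observing agent in a shared, unnormalized coordinate frame.

```python
import math


def adjacency_from_observations(observations, radius):
    """Build (A, M) from a tuple of per-agent [vehicles, features] arrays.

    Assumptions (not guaranteed by highway_env): each row is
    [presence, x, y, ...], and row 0 of every array is the observing
    agent itself, expressed in a frame shared by all agents.
    """
    # Take row 0 of each agent's observation as that agent's own state.
    ego_rows = [obs[0] for obs in observations]
    n = len(ego_rows)

    # Mask M: 1 where the agent is present, 0 otherwise.
    mask = [1 if row[0] > 0 else 0 for row in ego_rows]

    # Adjacency A: connect two distinct, present agents within `radius`.
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or not (mask[i] and mask[j]):
                continue
            dist = math.hypot(ego_rows[i][1] - ego_rows[j][1],
                              ego_rows[i][2] - ego_rows[j][2])
            if dist <= radius:
                adjacency[i][j] = 1
    return adjacency, mask


# Toy data: three agents, each observing two vehicles (itself first).
obs = (
    [[1.0, 0.0, 0.0, 20.0, 0.0], [1.0, 10.0, 0.0, 20.0, 0.0]],
    [[1.0, 10.0, 0.0, 20.0, 0.0], [1.0, 0.0, 0.0, 20.0, 0.0]],
    [[1.0, 100.0, 4.0, 25.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]],
)
A, M = adjacency_from_observations(obs, radius=30.0)
```

Because the function only indexes rows and columns, it should also accept NumPy arrays. With the default relative, normalized observations, the positions would first need to be converted to a common frame before a distance threshold is meaningful.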

redfish9 commented 8 months ago

So grateful for your quick response! Now I'm clear about the logic behind it.