Fernadoo / Papers_and_Refs

Interesting papers and references

[AAMAS'2021] Scalable Multiagent Driving Policies For Reducing Traffic Congestion #3

Closed Fernadoo closed 3 years ago

Fernadoo commented 3 years ago

http://www.ifaamas.org/Proceedings/aamas2021/pdfs/p386.pdf

Fernadoo commented 3 years ago

SUMO (Simulation of Urban MObility): an open source, highly portable, microscopic and continuous multi-modal traffic simulation package designed to handle large networks.
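
For reference, a minimal sketch of driving a SUMO run through its TraCI Python API; the scenario file name below is hypothetical and not the paper's setup.

```python
# Minimal TraCI loop: step the simulation and read vehicle speeds.
import traci

traci.start(["sumo", "-c", "scenario.sumocfg"])  # launch SUMO with a (hypothetical) config
for step in range(3600):                         # one simulated hour at 1 s steps
    traci.simulationStep()                       # advance the simulation by one step
    for veh_id in traci.vehicle.getIDList():     # vehicles currently in the network
        speed = traci.vehicle.getSpeed(veh_id)   # current speed in m/s
traci.close()
```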

Fernadoo commented 3 years ago

FLOW: a deep reinforcement learning framework for mixed-autonomy traffic, developed at UC Berkeley.

Fernadoo commented 3 years ago

Key assumption:

Fernadoo commented 3 years ago

The original metric is called Time-Average Sample-Average Speed. However, it is manipulable: naively maximizing it does not necessarily improve traffic efficiency. The authors therefore propose Outflow as an alternative metric.
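
A rough sketch of how I read the two metrics, under my own simplifying assumptions about the logs (per-step speed samples and exit timestamps); the paper's exact definitions may differ, e.g. in warm-up handling.

```python
from typing import List

def time_avg_sample_avg_speed(speeds_per_step: List[List[float]]) -> float:
    """Time average (over steps) of the sample-average speed of vehicles present."""
    per_step_means = [sum(s) / len(s) for s in speeds_per_step if s]
    return sum(per_step_means) / len(per_step_means)

def outflow(exit_times: List[float], horizon_s: float) -> float:
    """Vehicles that left the network per hour of simulated time."""
    return len(exit_times) / horizon_s * 3600.0

# Toy illustration of manipulability: admitting fewer vehicles keeps the sampled
# speeds high even though fewer vehicles actually get through the network.
congested = [[8.0, 7.5, 7.0]] * 100   # many (slower) vehicles in the network
gated = [[13.0]] * 100                # vehicles held back, the few inside go fast
print(time_avg_sample_avg_speed(gated) > time_avg_sample_avg_speed(congested))  # True
```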

Fernadoo commented 3 years ago

> The original metric is called Time-Average Sample-Average Speed. However, it is manipulable: naively maximizing it does not necessarily improve traffic efficiency. The authors therefore propose Outflow as an alternative metric.

Is there any possibility of involving mechanism design? In fact, the metric might only be manipulable from the perspective of altruistic agents. If the agents are self-interested, just like human drivers, then it becomes a mechanism design (MD) problem, and the policy could also be trained in a distributed manner.

Fernadoo commented 3 years ago

Project details and demos can be found here https://www.cs.utexas.edu/~aim/flow.html

Fernadoo commented 3 years ago

Centralized Multiagent Driving Policy

  1. Hand-crafted feature design instead of a raw screenshot.
  2. The state and action spaces grow exponentially with the number of controlled vehicles under a centralized approach, so a transfer learning method is also adopted (see the sketch after this list).
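
A minimal PyTorch sketch (my own illustration, not the paper's architecture) of why the centralized formulation explodes: a single head has to score every joint action, so the output width is |A|^n.

```python
import torch
import torch.nn as nn

N_VEHICLES = 4          # controlled AVs (assumption for illustration)
FEAT_DIM = 8            # hand-crafted features per vehicle (assumption)
N_ACTIONS = 3           # e.g. decelerate / hold / accelerate per vehicle

class CentralizedPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_VEHICLES * FEAT_DIM, 128),    # input width grows linearly in n
            nn.ReLU(),
            nn.Linear(128, N_ACTIONS ** N_VEHICLES),  # output width grows exponentially in n
        )

    def forward(self, joint_obs: torch.Tensor) -> torch.Tensor:
        return self.net(joint_obs)  # logits over all |A|^n joint actions

policy = CentralizedPolicy()
logits = policy(torch.randn(1, N_VEHICLES * FEAT_DIM))
print(logits.shape)  # torch.Size([1, 81])
```
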
Fernadoo commented 3 years ago

> The state and action spaces grow exponentially with the number of controlled vehicles under a centralized approach.

Agree with the second part (the action space) but not the first (the state space).

If you feed the states into a neural network, the exponential blow-up on the state side is largely mitigated: you simply concatenate all agents' states, or feed in a raw screenshot, and the input width then grows only linearly with the number of agents.

What really blows up is the action side: when the joint action space grows exponentially, the number of output labels grows exponentially, and consequently so does the number of parameters (see the sketch below).
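
A back-of-the-envelope check of this argument, with illustrative numbers of my own choosing rather than the paper's.

```python
# Input-side parameters grow linearly in the number of vehicles n,
# while a joint-action output head grows as |A|^n.
FEAT_DIM, HIDDEN, N_ACTIONS = 8, 128, 3

for n in (1, 2, 4, 8):
    input_params = (n * FEAT_DIM) * HIDDEN      # first-layer weights
    output_params = HIDDEN * (N_ACTIONS ** n)   # joint-action head weights
    print(f"n={n}: input-side {input_params:>7}, output-side {output_params:>9}")
# n=8 already needs 128 * 3^8 = 839,808 output weights versus 8,192 input weights.
```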

Fernadoo commented 3 years ago

Decentralized Multiagent Driving Policy

  1. Simply augments the state vector with several sensed features.
  2. Uses a combination of a selfish and a collaborative reward when training the distributed shared policy (see the sketch after this list).
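
A hedged sketch of what such a mixed selfish/collaborative reward could look like for the shared policy; the weighting scheme and the choice of speed terms are my assumptions, not the paper's exact reward.

```python
from typing import Sequence

def mixed_reward(own_speed: float,
                 neighbour_speeds: Sequence[float],
                 collab_weight: float = 0.5) -> float:
    """Convex combination of a selfish term and a collaborative term."""
    selfish = own_speed
    collaborative = (sum(neighbour_speeds) / len(neighbour_speeds)
                     if neighbour_speeds else own_speed)
    return (1.0 - collab_weight) * selfish + collab_weight * collaborative

# Example: an AV going 9 m/s surrounded by slower human-driven cars.
print(mixed_reward(9.0, [6.0, 7.0, 6.5], collab_weight=0.5))  # 7.75
```
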
Fernadoo commented 3 years ago
> Simply augments the state vector with several sensed features.

Not sure why they did not do it in the centralized setting.

Fernadoo commented 3 years ago

There is a lack of comparison with previous methods. More importantly, the authors do not explain why such a fully distributed policy can be built in this setting but not in the previous ones.