Closed Fernadoo closed 3 years ago
SUMO (Simulation of Urban MObility): an open source, highly portable, microscopic and continuous multi-modal traffic simulation package designed to handle large networks.
FLOW: a deep reinforcement learning framework for mixed-autonomy traffic, developed at UC Berkeley.
Key assumption:
The original metric is called Time-Average Sample-Average Speed. However, it is manipulable, i.e. naively maximizing this metric does not necessarily improve traffic efficiency. The authors therefore propose Outflow as an alternative metric.
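To make the manipulability concrete, here is a minimal sketch. The metric definitions below are my own reading (mean over time steps of the mean speed of vehicles currently in the network, and vehicles exiting per hour), not formulas copied from the paper, and both scenarios use made-up numbers.

```python
from statistics import mean


def time_avg_sample_avg_speed(speeds_per_step):
    """Mean over time steps of the mean speed of the vehicles present at each step."""
    per_step_means = [mean(step) for step in speeds_per_step if step]
    return mean(per_step_means)


def outflow(exits_per_step, step_seconds):
    """Vehicles leaving the network per hour."""
    return sum(exits_per_step) * 3600 / (len(exits_per_step) * step_seconds)


# Scenario A (hypothetical): many vehicles admitted, moderate speeds, steady exits.
speeds_a = [[10.0, 12.0, 11.0, 9.0]] * 4
exits_a = [2, 2, 2, 2]

# Scenario B (hypothetical): inflow is held back, so the single vehicle inside
# drives fast, but far fewer vehicles get through.
speeds_b = [[25.0]] * 4
exits_b = [0, 1, 0, 1]

for name, speeds, exits in (("A", speeds_a, exits_a), ("B", speeds_b, exits_b)):
    print(name, time_avg_sample_avg_speed(speeds), outflow(exits, step_seconds=1.0))
```

Scenario B scores higher on average speed while serving fewer vehicles, which is the kind of gaming that a throughput-style metric such as Outflow is meant to rule out.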
Is there any possibility of involving mechanism design? In fact, the metric might only be manipulable from the perspective of altruistic agents. If the agents are instead self-interested, just like human drivers, then it becomes a mechanism design (MD) problem, and the policy can also be trained in a distributed manner.
Project details and demos can be found here: https://www.cs.utexas.edu/~aim/flow.html
Centralized Multiagent Driving Policy
The state and action space grow exponentially with the number of controlled vehicles when using a centralized approach.
I agree with the second part (the action space) but not the first part (the state space).
If you apply a neural network to the states, the exponential growth should be largely mitigated, since you just concatenate all agents' states or input a raw screenshot, so the input size grows only linearly with the number of agents. After that, the network's layers are essentially linear operations on that input.
The main reason the whole thing gets out of hand is the action space: when it grows exponentially, the number of output labels grows exponentially as well, and consequently so does the number of parameters.
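The arithmetic behind this argument can be sketched as follows. This assumes discrete per-vehicle actions and a network with one output unit per joint action; the feature count, action count, and hidden width are hypothetical values chosen for illustration, not numbers from the paper.

```python
def joint_input_size(n_agents, features_per_agent):
    """Concatenated state: grows linearly with the number of agents."""
    return n_agents * features_per_agent


def joint_action_count(n_agents, actions_per_agent):
    """One label per combination of per-agent actions: grows exponentially."""
    return actions_per_agent ** n_agents


def output_layer_params(hidden_units, n_outputs):
    """Weights plus biases of a dense output layer."""
    return hidden_units * n_outputs + n_outputs


FEATURES = 4   # hypothetical features per vehicle
ACTIONS = 3    # hypothetical discrete actions per vehicle
HIDDEN = 64    # hypothetical width of the last hidden layer

for n in (1, 2, 5, 10):
    labels = joint_action_count(n, ACTIONS)
    print(n, joint_input_size(n, FEATURES), labels, output_layer_params(HIDDEN, labels))
```

With these numbers, going from 1 to 10 vehicles takes the input from 4 to 40 values, while the output layer goes from 3 to 59049 labels.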
Decentralized Multiagent Driving Policy
- Just augments the state vector with several sensed features.
Not sure why they did not do this in the centralized setting as well.
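As a rough illustration of what "augmenting the state vector" could look like in a decentralized setup, here is a toy sketch. The feature names (own speed and position, leader speed, headway), the linear policy, and its weights are all hypothetical placeholders of mine; the paper's actual features and network are not reproduced here.

```python
def augment_observation(own_state, sensed_features):
    """Append locally sensed features to the vehicle's own state vector."""
    return list(own_state) + list(sensed_features)


def shared_policy(observation, weights):
    """Toy linear policy returning one acceleration command (hypothetical)."""
    return sum(w * x for w, x in zip(weights, observation))


# Hypothetical per-vehicle data: ([own speed, own position], [leader speed, headway]).
vehicles = {
    "av_0": ([12.0, 30.0], [8.0, 15.0]),
    "av_1": ([9.0, 80.0], [11.0, 40.0]),
}

# One set of weights reused by every vehicle, so the policy size does not
# depend on how many vehicles are controlled.
weights = [-0.1, 0.0, 0.1, 0.05]

actions = {
    name: shared_policy(augment_observation(own, sensed), weights)
    for name, (own, sensed) in vehicles.items()
}
print(actions)
```

Nothing in `augment_observation` is specific to the decentralized case; the same concatenation could be applied per vehicle before building a centralized joint state.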
There is a lack of comparison with previous methods. More importantly, they do not explain why such a fully distributed policy is feasible in this setting but not in previous settings.
http://www.ifaamas.org/Proceedings/aamas2021/pdfs/p386.pdf