
Deep Q-Learning Agent for Traffic Signal Control

A framework where a deep Q-Learning Reinforcement Learning agent tries to choose the correct traffic light phase at an intersection to maximize traffic efficiency.

I have uploaded this here to help anyone searching for a good starting point for deep reinforcement learning with SUMO. This code is extracted from my master's thesis and represents a simplified version of the code I used for the thesis work. I hope you find this repository useful for your project.

Getting Started

These instructions will get you a copy of the project up and running on your local machine. The following are, in my opinion, the easiest steps to run the algorithm from scratch with the least amount of effort. A computer with an NVIDIA GPU is strongly recommended.

  1. Download Anaconda (official site) and install.
  2. Download SUMO (official site) and install.
  3. Follow this short guide to install tensorflow-gpu correctly and without problems. In short, the guide tells you to open Anaconda Prompt, or any terminal, and type the following commands:
    conda create --name tf_gpu
    conda activate tf_gpu
    conda install tensorflow-gpu

I used the following software versions: Python 3.7, SUMO traffic simulator 1.2.0, and TensorFlow 2.0.
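To check that your environment matches these versions (assuming python and sumo are on your PATH):

    python --version
    sumo --version
    python -c "import tensorflow as tf; print(tf.__version__)"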

Running the algorithm

  1. Clone or download the repo.
  2. Using the Anaconda prompt or any other terminal, navigate to the root folder and run the file training_main.py by executing:
    python training_main.py

The agent should now start training.

You don't need to open any SUMO software, since everything is loaded and run in the background. If you want to watch the training as it progresses, set the gui parameter in the file training_settings.ini to True. Keep in mind that viewing the simulation is very slow compared to background training, and you also need to close SUMO-GUI every time an episode ends, which is not practical.
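For reference, the relevant entry looks something like this (the section name here is illustrative; check your copy of training_settings.ini for the exact layout):

    [simulation]
    gui = True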

The file training_settings.ini contains all the parameters used by the agent in the simulation. The default parameters are not heavily tuned, so a bit of experimentation will likely improve the algorithm's performance.

When the training ends, the results are stored in "./model/model_x/", where x is an increasing integer starting from 1, generated automatically. The results include some graphs, the data used to create them, the trained neural network, and a copy of the ini file with the agent settings.
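The auto-numbering of the model folder can be done in a few lines; the sketch below is illustrative and not necessarily the repository's exact code:

    import os

    def next_model_path(base_dir='./model'):
        # Find the highest existing model_x folder and create model_{x+1}
        os.makedirs(base_dir, exist_ok=True)
        ids = [int(name.split('_')[1]) for name in os.listdir(base_dir)
               if name.startswith('model_') and name.split('_')[1].isdigit()]
        path = os.path.join(base_dir, 'model_' + str(max(ids, default=0) + 1))
        os.makedirs(path)
        return path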

Now you can finally test the trained agent. To do so, run the file testing_main.py. The test consists of a single simulation episode, and its results are stored in "./model/model_x/test/", where x is the number of the model you chose to test. The model number to test and other useful parameters are contained in the file testing_settings.ini.
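As a hypothetical example of what selecting a model looks like (the actual section and key names may differ, so check testing_settings.ini):

    ; illustrative key name for selecting "./model/model_1/"
    model_to_test = 1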

Training time: about 27 seconds per episode (roughly 45 minutes for 100 episodes) on a computer equipped with an i7-3770K, 8 GB RAM, an NVIDIA GTX 970, and an SSD.

The code structure

The main file is training_main.py. It handles the main loop, which starts an episode on each iteration, and it also saves the network weights and three plots: negative reward, cumulative wait time, and average queues.
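Schematically, the loop looks like the sketch below; the names are hypothetical, and the real structure lives in training_main.py:

    # Illustrative skeleton of the training loop, not the repository's exact code.
    TOTAL_EPISODES = 100

    def run_episode(episode, epsilon):
        # Placeholder for one SUMO episode: generate routes, step the
        # simulation, pick traffic light phases, and train the network.
        pass

    for episode in range(TOTAL_EPISODES):
        # Epsilon-greedy exploration, decaying as training progresses
        epsilon = 1.0 - episode / TOTAL_EPISODES
        run_episode(episode, epsilon)

    # After the last episode, the network weights and the three plots
    # (negative reward, cumulative wait time, average queues) are saved.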

Overall the algorithm is divided into classes that handle different parts of the training.

In the "intersection" folder there is a file called environment.net.xml, which defines the structure of the environment; it was created using SUMO NetEdit. The other file, sumo_config.sumocfg, links the environment file to the route file.
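A minimal .sumocfg has roughly this shape (the route file name below is a placeholder for the file generated at each episode):

    <configuration>
        <input>
            <net-file value="environment.net.xml"/>
            <route-files value="routes.rou.xml"/>
        </input>
    </configuration>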

The settings explained

The settings used during training, contained in the file training_settings.ini, are the following:

The settings used during testing, contained in the file testing_settings.ini, are the following (some of them must match those used in the corresponding training run):

The Deep Q-Learning Agent

Framework: Q-Learning with deep neural network.
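Concretely, the neural network is trained to approximate the standard Q-learning target. Here is a minimal numpy sketch (the discount factor gamma shown here is illustrative, not necessarily the value used in the settings file):

    import numpy as np

    def q_target(reward, q_next, gamma=0.75):
        # Standard Q-learning target: immediate reward plus the discounted
        # value of the best action available in the next state.
        return reward + gamma * np.max(q_next)

    # Example: reward of -15 and the network's Q-values for the next state
    print(q_target(-15.0, np.array([2.0, 5.5, 1.0, 3.2])))  # -15 + 0.75 * 5.5 = -10.875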

Context: traffic signal control of 1 intersection.

Environment: a 4-way intersection with 4 incoming lanes and 4 outgoing lanes per arm. Each arm is 750 meters long. Each incoming lane defines the possible directions a car can take: the left-most lane is dedicated to left turns only; the right-most lane handles right turns and through traffic; the two middle lanes are for through traffic only. The traffic light layout is as follows: the left-most lane has a dedicated traffic light, while the other three lanes share the same traffic light.

Traffic generation: for every episode, 1000 cars are created. Car arrival times follow a Weibull distribution with shape 2 (arrivals increase rapidly until mid-episode, then slowly decrease). 75% of the spawned vehicles go straight, 25% turn left or right. Every vehicle has the same probability of spawning at the beginning of each arm. Cars are generated randomly in every episode, so no two episodes have the same arrival pattern.
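A numpy sketch of this kind of arrival generation (simplified; the repository's generator also writes the SUMO route file, and the 5400-step episode length here is an assumption):

    import numpy as np

    def generate_arrival_steps(n_cars=1000, max_step=5400, seed=None):
        # Weibull(shape=2) samples, rescaled to the episode length and sorted,
        # yield arrivals that rise quickly and then taper off.
        rng = np.random.default_rng(seed)
        timings = np.sort(rng.weibull(2, n_cars))
        scaled = timings / timings.max() * max_step
        return np.rint(scaled).astype(int)

    print(generate_arrival_steps(seed=42)[:10])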

Agent (Traffic Signal Control System - TLCS):

Changelog - New version, updated on 12 Jan 2020

Author

If you need further information about the algorithm, I suggest opening an issue on the issues page.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Buy me a coffee!

Hi 👋 My name is Andrea.

If this repo helped you in some way and you want to say thanks, consider buying me a coffee!
