Luca96 / carla-driving-rl-agent

Code for the paper "Reinforced Curriculum Learning for Autonomous Driving in CARLA" (ICIP 2021)
MIT License

CARLA Driving RL Agent

A follow-up to my master's thesis project, which uses deep reinforcement learning to train an autonomous driving agent. The agent is trained with the Proximal Policy Optimization (PPO) algorithm in a simulated driving environment provided by the CARLA simulator (paper). Training is organized into increasingly difficult stages, following the idea of curriculum learning.

This work has been accepted at the International Conference on Image Processing (ICIP 2021). The conference paper is available here. We have also released an open-access journal version of the paper: you can find it here.

Requirements, installation instructions, and results are listed below.


Requirements

Software:

Hardware (minimum):


Installation

Before running any code from this repo you have to:

  1. Clone this repo: git clone https://github.com/Luca96/carla-driving-rl-agent.git
  2. Download CARLA 0.9.9 from its GitHub repo, where you can find ready-to-use precompiled binaries. Refer to carla-quickstart for more information.
  3. Install CARLA Python bindings in order to be able to manage CARLA from Python code. Open your terminal and type:

    • Windows: cd your-path-to-carla/CARLA_0.9.9.4/WindowsNoEditor/PythonAPI/carla/dist/
    • Linux: cd your-path-to-carla/CARLA_0.9.9.4/PythonAPI/carla/dist/
    • Extract carla-0.9.9-py3.7-XXX-amd64.egg where XXX depends on your OS, e.g. win for Windows.
    • Create a setup.py file within the extracted folder and write the following:

      from distutils.core import setup
      
      setup(name='carla',
          version='0.9.9',
          py_modules=['carla']) 
    • Install via pip: pip install -e ~/CARLA_0.9.9.4/PythonAPI/carla/dist/carla-0.9.9-py3.7-XXX-amd64
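
The extraction and setup.py steps above can also be scripted. The following is only a sketch: it assumes the .egg is a regular zip archive (as Python eggs normally are), and `prepare_carla_egg` is a hypothetical helper name, not something shipped with CARLA or this repo. The setup.py contents are copied from the step above.

```python
import zipfile
from pathlib import Path

# Contents of setup.py, copied from the installation step above.
SETUP_PY = """from distutils.core import setup

setup(name='carla',
      version='0.9.9',
      py_modules=['carla'])
"""


def prepare_carla_egg(egg_path):
    """Extract a CARLA .egg next to itself and add a setup.py to it.

    Returns the extracted folder, which can then be given to `pip install -e`.
    """
    egg_path = Path(egg_path)
    # Drop the ".egg" suffix to get the folder name,
    # e.g. carla-0.9.9-py3.7-XXX-amd64
    target = egg_path.with_suffix("")
    with zipfile.ZipFile(egg_path) as archive:
        archive.extractall(target)
    (target / "setup.py").write_text(SETUP_PY)
    return target
```

Call `prepare_carla_egg(...)` with the path to the .egg for your OS, then run `pip install -e` on the returned folder as in the last step.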

Before running the repository's code be sure to start CARLA first:


Examples

Show the agent's network architecture (without running CARLA):

from core import CARLAgent, FakeCARLAEnvironment

agent = CARLAgent(FakeCARLAEnvironment(), batch_size=1, log_mode=None)
agent.summary()

Play with the CARLA environment (requires running CARLA):

from core import CARLAEnv
from rl import CARLAPlayWrapper

# Set `debug=False` if the framerate is very low.
# For better image quality, increase `image_shape` according to your hardware.
env = CARLAEnv(debug=True, window_size=(900, 245), image_shape=(90, 120, 3)) 
CARLAPlayWrapper(env).play()
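
Since this example needs a running simulator, it can help to check that the server is reachable before creating the environment. The sketch below uses only the standard library and assumes CARLA is listening on its default TCP port 2000 on localhost; `carla_is_reachable` is a hypothetical helper, not part of this repo or of the CARLA API. An open port does not guarantee the simulator is fully loaded, so treat this as a quick sanity check only.

```python
import socket


def carla_is_reachable(host="localhost", port=2000, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False
```

If `carla_is_reachable()` returns False, start CARLA (or adjust host and port) before running the example.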

Reinforcement learning example:

from core import learning

learning.stage_s1(episodes=5, timesteps=256, gamma=0.999, lambda_=0.995, save_every='end', stage_name='stage',
                  seed=42, polyak=0.999, aug_intensity=0.0, repeat_action=1, load_full=False)\
        .run2(epochs=10)
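
The `gamma` and `lambda_` arguments are named like the discount factor and the Generalized Advantage Estimation (GAE) coefficient commonly used with PPO. Assuming that is their role here (the text above does not confirm it), the sketch below illustrates how the two values shape the advantages. It is a generic, plain-Python illustration with a hypothetical function name, not the repository's implementation.

```python
def gae_advantages(rewards, values, gamma=0.999, lambda_=0.995):
    """Compute GAE advantages for a single trajectory.

    rewards: list of T rewards.
    values:  list of T + 1 value estimates (the last one bootstraps
             the state reached after the final step).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the accumulated future term.
    for t in reversed(range(len(rewards))):
        # One-step TD error.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of the current and future TD errors.
        running = delta + gamma * lambda_ * running
        advantages[t] = running
    return advantages
```

With values close to 1, as in the call above, advantages depend on rewards far into the future; lowering `lambda_` shortens that horizon and relies more on the value estimates.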

The complete training procedure is shown in main.py. Be aware that each stage can take a long time to finish, so comment out what you don't need!

NOTE: When loading the agent across stages, e.g. from stage_s1 to stage_s2, be sure to manually copy and rename the saved agent's weights; otherwise, use the same stage_name for each stage.
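
The manual copy can be scripted. The sketch below assumes, hypothetically, that each stage saves its weights in a folder named after its stage_name under a common root directory. That layout is an assumption, as are the helper name and its parameters, so adapt them to wherever the agent actually writes its weights.

```python
import shutil
from pathlib import Path


def copy_stage_weights(weights_root, src_stage, dst_stage):
    """Copy saved weights from one stage name to another.

    Assumes a (hypothetical) layout of <weights_root>/<stage_name>/...
    """
    src = Path(weights_root) / src_stage
    dst = Path(weights_root) / dst_stage
    if not src.is_dir():
        raise FileNotFoundError(f"no saved weights found at {src}")
    if dst.exists():
        # Avoid silently overwriting weights of a stage already trained.
        raise FileExistsError(f"{dst} already exists; refusing to overwrite")
    shutil.copytree(src, dst)
    return dst
```

Copying rather than moving keeps the earlier stage's weights available in case a later stage has to be rerun.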


Agent Architecture

The agent leverages the following neural network architecture:

agent_architecture

For more details refer to core/networks.py, in particular to the dynamics_layers function and CARLANetwork class.


Results

All the experiments were run on a machine with:

All agents were evaluated on six metrics (collision rate, similarity, speed, waypoint distance, total reward, and timesteps), under two disjoint weather sets (only one of which was used during training), and over all CARLA towns (Town01 to Town10), although they were trained only on Town03.

Town01, daylight:

agent-performance-town01

Town02, daylight:

agent_town02_day

Town07, evening:

agent_town07_eve

Town07, night:

agent_town07_night

The following table shows the performance of three agents: curriculum (C), standard (S), and untrained (U). The curriculum agent (C) combines PPO with curriculum learning, whereas the standard agent (S) doesn't use any curriculum. Lastly, the untrained agent (U) has the same architecture as the other two but with random weights, so it provides a (non-trivial) baseline for comparison.

performance table

For detailed results over each evaluation scenario, refer to the extensive evaluation table: src\extensive_evaluation_table.


Cite this Work

If this work is useful for your own research, please cite the paper, and/or mention this repository:

@inproceedings{anzalone2021reinforced,
  title={Reinforced Curriculum Learning For Autonomous Driving In Carla},
  author={Anzalone, Luca and Barra, Silvio and Nappi, Michele},
  booktitle={2021 IEEE International Conference on Image Processing (ICIP)},
  pages={3318--3322},
  year={2021},
  organization={IEEE}
}

Citation for the journal version:

@article{anzalone2022end,
  title={An End-to-End Curriculum Learning Approach for Autonomous Driving Scenarios},
  author={Anzalone, Luca and Barra, Paola and Barra, Silvio and Castiglione, Aniello and Nappi, Michele},
  journal={IEEE Transactions on Intelligent Transportation Systems},
  year={2022},
  pages={1-10},
  publisher={IEEE},
  doi={10.1109/TITS.2022.3160673}
}