
GoDrive

Description

GoDrive is a lane-following autonomous driving robot implementation that uses image classification as a form of imitation learning. Sum-product networks (SPNs) are used because they admit exact inference in time linear in the network size, allowing uncertainty to be measured accurately and quickly in real time.
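To give a sense of why inference is cheap, here is a minimal, self-contained sketch of bottom-up SPN evaluation. This is an illustration, not GoSPN's actual API: each node is computed once from its children, so a pass over a tree-shaped network costs time linear in the number of edges (a DAG-shaped SPN would additionally memoize shared node values to keep the pass linear).

```go
package main

import (
	"fmt"
	"math"
)

// Node is a minimal SPN node: a leaf holds the log-likelihood of the
// current evidence; sum and product nodes combine their children.
type Node struct {
	Kind     string    // "leaf", "sum", or "product"
	LogP     float64   // used when Kind == "leaf"
	Children []*Node
	Weights  []float64 // used when Kind == "sum"; must sum to 1
}

// Eval returns the log-probability of the evidence stored in the leaves.
// Every node is visited once, hence the linear-time cost.
func Eval(n *Node) float64 {
	switch n.Kind {
	case "leaf":
		return n.LogP
	case "product":
		// A product node multiplies its children: add log-probabilities.
		s := 0.0
		for _, c := range n.Children {
			s += Eval(c)
		}
		return s
	default: // "sum"
		// A sum node is a weighted mixture: log-sum-exp over children.
		lse := math.Inf(-1)
		for i, c := range n.Children {
			lse = logAdd(lse, math.Log(n.Weights[i])+Eval(c))
		}
		return lse
	}
}

// logAdd computes log(exp(a) + exp(b)) without underflow.
func logAdd(a, b float64) float64 {
	if a == math.Inf(-1) {
		return b
	}
	if a < b {
		a, b = b, a
	}
	return a + math.Log1p(math.Exp(b-a))
}

func main() {
	// P(X) = 0.3*L1 + 0.7*L2, leaves already set to log-likelihoods.
	l1 := &Node{Kind: "leaf", LogP: math.Log(0.2)}
	l2 := &Node{Kind: "leaf", LogP: math.Log(0.6)}
	root := &Node{Kind: "sum", Children: []*Node{l1, l2}, Weights: []float64{0.3, 0.7}}
	fmt.Printf("log P = %.4f\n", Eval(root)) // log(0.3*0.2 + 0.7*0.6)
}
```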

This code is part of my undergraduate thesis, Mobile Robot Self-Driving Through Image Classification Using Discriminative Learning of Sum-Product Networks. The full thesis can be read here. Both prediction and training of SPNs are done through the GoSPN library.

Objectives

The primary objectives of this implementation are twofold: to serve as a comparative study of different SPN architectures and learning methods, and as preliminary work on the feasibility of SPNs as a real-time prediction model for self-driving. A secondary objective was to compare SPNs with state-of-the-art multilayer perceptrons (MLPs) and convolutional neural networks (CNNs).

Lane following as self-driving

Ours is a primitive approach to self-driving: lane following through imitation learning. The robot's objective is to remain inside a designated lane whilst moving forward. Since the robot is never allowed to stop, the prediction model must be both accurate, identifying lane markings and making the necessary heading corrections, and fast.

The robot was allowed to execute three operations: go forward, turn left, or turn right. The prediction output encoded these three commands as a single byte.
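The README does not fix the concrete byte values, but the scheme amounts to something like the following sketch (the constants are illustrative, not necessarily the values GoDrive uses):

```go
package main

import "fmt"

// Command is the single byte sent from the Berry to the Brick.
// These values are illustrative; GoDrive's actual encoding may differ.
type Command byte

const (
	Forward Command = iota // keep heading, move forward
	Left                   // correct heading to the left
	Right                  // correct heading to the right
)

func main() {
	pred := Left // e.g., the label the SPN just predicted
	fmt.Printf("sending byte 0x%02x to the Brick\n", byte(pred))
}
```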

Hardware

We experimented with the Lego Mindstorms NXT, nicknamed Brick. A Raspberry Pi 3, nicknamed Berry, was attached to the robot, together with a low-cost webcam. The Berry handled image capture, processing, and label prediction, sending the predicted label to the Brick, which was tasked only with executing the corresponding motor commands.

Prediction

Prediction was done in real time. The implementation in this repository takes advantage of the Berry's four CPU cores: three cores each compute the score of one label concurrently, while the fourth handles image capture, preprocessing, and sending the predicted byte to the Brick.
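A sketch of that split is below. The scoreLabel helper is hypothetical and stands in for the real GoSPN evaluation call; one goroutine per label mirrors the one-core-per-label layout:

```go
package main

import (
	"fmt"
	"sync"
)

// Frame is a preprocessed camera image (e.g., downsampled grayscale).
type Frame []float64

// scoreLabel stands in for evaluating one label's SPN on the frame;
// in GoDrive this would be a call into the GoSPN library.
func scoreLabel(label int, f Frame) float64 {
	// ... evaluate the label's (sub-)network on the frame ...
	return -float64(label) // placeholder score
}

// predict scores the three labels concurrently and returns the argmax
// as the command byte to send to the Brick.
func predict(f Frame) byte {
	scores := make([]float64, 3)
	var wg sync.WaitGroup
	for l := 0; l < 3; l++ {
		wg.Add(1)
		go func(l int) {
			defer wg.Done()
			scores[l] = scoreLabel(l, f) // disjoint indices: no data race
		}(l)
	}
	wg.Wait()
	best := 0
	for l := 1; l < 3; l++ {
		if scores[l] > scores[best] {
			best = l
		}
	}
	return byte(best)
}

func main() {
	// The fourth core's loop: capture, preprocess, predict, transmit.
	frame := make(Frame, 80*60) // stub for a captured, preprocessed image
	cmd := predict(frame)
	fmt.Printf("predicted command byte: %d\n", cmd)
	// ... send cmd to the Brick ...
}
```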

Training

Training was done separately on a desktop computer. You can read more about training and validation here.

Results

Results are available both in the thesis and in video.

Code structure

The code is structured as follows: