libornovax / master_thesis_code

Code for my master thesis: Vehicle Detection and Pose Estimation for Autonomous Driving
MIT License

PR Curves on 2D bounding boxes #29

Closed libornovax closed 7 years ago

libornovax commented 7 years ago

I currently train on KITTI and evaluate on Jura test short. Below is a curated list of the experiments carried out so far. To plot the PR curves I needed non-maximum suppression (NMS); for now I implemented classical NMS with an intersection-over-union threshold of 0.5.
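For readers unfamiliar with it, classical greedy NMS can be sketched as follows. This is a minimal illustration, not the code used in this repository; the box format `(xmin, ymin, xmax, ymax)` and the function names `iou` and `nms` are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0  # boxes do not overlap
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def nms(detections, iou_threshold=0.5):
    """Greedy NMS over a list of (box, score) pairs.

    Detections are visited from the highest score down; a detection is kept
    only if it does not overlap any already kept box by more than the threshold.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),    # overlaps the first box with IoU of about 0.68
    ((20, 20, 30, 30), 0.7),  # far away from both
]
print(nms(detections))
```

With the threshold of 0.5 the second box is suppressed by the first, while the distant third box survives.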

I used the basic net in all experiments:

macc_0.25_r2_x4
r2 c0.25
conv k3      o64
conv k3  d2  o64
pool
conv k3      o128
conv k3  d2  o128
pool
conv k3      o256
conv k3  d1  o256
conv k3  d3  o256
conv k3  d7  o256
macc x4
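As a rough aid for reading the listing above, the receptive field of such a stack can be estimated as below. This is only a sketch under assumptions that are not stated in this thread: `k` is read as kernel size, `d` as dilation (1 when omitted), every `pool` is assumed to be 2x2 with stride 2, and the `r2 c0.25` and `macc x4` lines are ignored.

```python
def receptive_field(layers):
    """Return (receptive_field, stride) for a stack of (kernel, dilation, stride) layers."""
    rf, jump = 1, 1  # jump = distance in input pixels between neighbouring outputs
    for kernel, dilation, stride in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf, jump


CONV = lambda d=1: (3, d, 1)  # assumed: 3x3 convolution with dilation d, stride 1
POOL = (2, 1, 2)              # assumed: 2x2 pooling with stride 2

layers = [
    CONV(), CONV(2), POOL,
    CONV(), CONV(2), POOL,
    CONV(), CONV(1), CONV(3), CONV(7),
]
rf, stride = receptive_field(layers)
print(rf, stride)
```

Under these assumptions the sketch gives a receptive field of 118 px and a total stride of 4; the latter would match the `x4` in the net name if that suffix denotes the downsampling factor.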

Very first training and test. Basic KITTI labels, changing learning rate. [figure: pr_curves]

Labels from 3D annotations, uniform learning rate. [figure: pr_curves]

Higher uniform learning rate, same setup. [figure: pr_curves]

Switched to training with a larger spread of reference sizes. Also replaced the repeated border with a black one. [figure: pr_curves]

libornovax commented 7 years ago

Removed trucks from the KITTI training dataset, as well as heavily occluded and truncated cars. Added flipped images and a new random sampling strategy for selecting bounding boxes. [figure: pr_curves]

On visual inspection, these results look very plausible. The result is also better than that of SSD 500x500.
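When images are flipped for augmentation, the bounding box labels have to be mirrored too. A minimal sketch, assuming a horizontal (left-right) flip and boxes stored as `(xmin, ymin, xmax, ymax)` in pixels; the function name is hypothetical and not taken from this repository.

```python
def flip_box_horizontally(box, image_width):
    """Mirror a (xmin, ymin, xmax, ymax) box around the vertical image axis."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge and vice versa,
    # so xmin <= xmax still holds after the flip.
    return (image_width - xmax, ymin, image_width - xmin, ymax)


print(flip_box_horizontally((10, 20, 50, 60), image_width=100))
```

Flipping twice returns the original box, which is a handy sanity check for the label pipeline.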

libornovax commented 7 years ago

Gaussian blob

This time I trained the network to predict Gaussian blobs instead of sharp circles. [figure: pr_curves]

The result is almost the same as in the previous case, but visually it seems better. I will use the Gaussian from now on.
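The difference between the two target types can be illustrated with a small sketch. The grid size, `radius` and `sigma` below are made-up values for illustration, not the ones used in training.

```python
import math


def circle_target(size, cx, cy, radius):
    """Binary target: 1 inside a circle centred at (cx, cy), 0 elsewhere."""
    return [[1.0 if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else 0.0
             for x in range(size)] for y in range(size)]


def gaussian_target(size, cx, cy, sigma):
    """Soft target: an unnormalised Gaussian that peaks at 1 in (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
             for x in range(size)] for y in range(size)]


sharp = circle_target(7, 3, 3, radius=1)
soft = gaussian_target(7, 3, 3, sigma=1.0)
print([round(v, 2) for v in sharp[3]])  # middle row of the sharp circle
print([round(v, 2) for v in soft[3]])   # middle row of the Gaussian blob
```

The sharp target jumps from 1 to 0 at the circle boundary, whereas the Gaussian decays smoothly with distance from the object centre.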

libornovax commented 7 years ago

Prediction confidence (error) channel

I created a new output channel that learns to predict the errors of the network's probability channel. I then combine it with the probability to get a confidence value for the prediction. (This net does not learn on Gaussians - that will come later.) [figure: pr_curves]

The results on the Jura test short set look superior to the previous ones; however, I am still skeptical about the actual impact of the confidence channel.
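The exact rule for combining the two channels is not given in this comment. Purely as an illustration, one plausible way is to down-weight the probability by the predicted error; the formula and the function name below are assumptions, not the method actually used here.

```python
def confidence(probability, predicted_error):
    """Hypothetical combination: scale the probability by (1 - predicted error).

    The predicted error is clamped to [0, 1] so that the confidence stays
    within [0, probability].
    """
    error = min(max(predicted_error, 0.0), 1.0)
    return probability * (1.0 - error)


print(confidence(0.9, 0.0))  # no predicted error: confidence equals the probability
print(confidence(0.9, 1.0))  # maximal predicted error: confidence drops to zero
```

Any monotonic combination in which a larger predicted error lowers the confidence would serve the same purpose of re-ranking detections before the PR curve is computed.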

libornovax commented 7 years ago

Prediction on 3 scales simultaneously

I trained a network with multiple (3) accumulators, macc_0.3_r2_x2_to_x8_s2_kitti, which can detect objects of 23-220 px, i.e. it cannot detect objects larger than that. Here I show the performance of the network. On the original Jura test short dataset it is a bit worse than the nets above: [figure: pr_jura_test_short]

We do not reach precision 1 because the net detects small objects, which are not labeled in the original Jura dataset.

When compared on the newly labeled Jura dataset, which also includes all the very small cars, it probably performs better. Most importantly, the detections are all extracted in one pass, so it is much faster! [figure: pr_jura_test_short_fix] [figure: pr_jura_test_short_fix]
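The effect of unlabeled objects on the PR curve can be reproduced with a toy example. The sketch below assumes that each detection has already been matched against the ground truth (for instance by an IoU criterion) and is flagged as hit or miss; the scores are invented for illustration.

```python
def pr_curve(detections, num_ground_truth):
    """Return (precision, recall) points as the score threshold is lowered.

    detections: list of (score, is_true_positive) pairs.
    """
    tp = fp = 0
    curve = []
    for _, is_hit in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_hit:
            tp += 1
        else:
            fp += 1
        curve.append((tp / (tp + fp), tp / num_ground_truth))
    return curve


# The most confident detection is a real but unlabeled small car,
# so the evaluation counts it as a false positive.
detections = [(0.95, False), (0.90, True), (0.80, True)]
for precision, recall in pr_curve(detections, num_ground_truth=2):
    print(round(precision, 2), round(recall, 2))
```

Because the top-scoring detection is counted as a false positive, precision never reaches 1 even though every labeled car is found. Adding the missing label would turn that detection into a true positive and raise the whole curve.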