This video contains results and illustrations of the challenges encountered during this project:
The project was designed to be modular and reusable. Each significant independent domain gets its own class and an individual file:

- `main.py` - Main test runner with the `VehicleDetection` class and test functions like `train_svc_with_color_hog_hist`, `test_sliding_window`, and `train_or_load_model`
- `utils.py` - Handy utilities like `imcompare`, `warper`, and `debug`, shared across modules
- `settings.py` - Hyperparameters and settings shared across modules
- `rolling_statistics.py` - `RollingStatistics` class to compute `moving_average` and `rolling_sum`
- `README.md` - Description of the development process (this file)

The Pipeline section below gives a high-level description of the pipeline with pointers to the implementation. The code is fairly readable and contains detailed comments that explain how it works.
Set hyperparameters and configuration in `settings.py` and run the `main.py` script as shown below. The repository includes all required files and can be used to rerun vehicle detection & tracking on a given video. Refer to the References section below for the training dataset.
```shell
$ grep 'INPUT\|OUTPUT' -Hn settings.py
settings.py:9: INPUT_VIDEOFILE = 'test_video.mp4'
settings.py:11: OUTPUT_DIR = 'output_images/'
$ python main.py
[test_slide_search_window:369] Processing Video: test_video.mp4
[MoviePy] >>>> Building video output_images/test_video_output.mp4
[MoviePy] Writing video output_images/test_video_output.mp4
100%|██████████████████████████████████████████████████████████████████| 39/39 [02:01<00:03,  3.05s/it]
[MoviePy] Done.
[MoviePy] >>>> Video ready: output_images/test_video_output.mp4
$ open output_images/test_video_output.mp4
```
Basic Data Exploration

A `Car` and a `Road` image were picked as canonical "Vehicle" and "Non-Vehicle" class samples for data exploration. See the figures in the Pipeline In Action section.

**Feature Extraction** - Features are extracted from the "Vehicle" and "Non-Vehicle" classes in `extract_features_hog` and `single_img_features`: `spatial_features`, `hist_features`, and `hog_features`.
**Training Car Detector** - Done with `train_or_load_model` using a `LinearSVC` in `train_svc_with_color_hog_hist` (`settings.py:L31-L41`). With HOG features added, the accuracy rose up to 99%. The trained model is persisted with `joblib`, not `pickle`, as `joblib` handles large numpy arrays a lot more efficiently.
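A minimal sketch of the training-and-persistence step, assuming `car_features` and `notcar_features` are precomputed feature arrays (the function name `train_svc`, the 80/20 split, and the model path are illustrative, not the project's exact code):

```python
import joblib
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

def train_svc(car_features, notcar_features, model_path='svc_model.joblib'):
    """Scale features, train a LinearSVC, report accuracy, persist with joblib."""
    X = np.vstack((car_features, notcar_features)).astype(np.float64)
    y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))

    scaler = StandardScaler().fit(X)  # per-feature zero mean, unit variance
    X_train, X_test, y_train, y_test = train_test_split(
        scaler.transform(X), y, test_size=0.2, random_state=42)

    svc = LinearSVC()
    svc.fit(X_train, y_train)
    print('Test accuracy: %.4f' % svc.score(X_test, y_test))

    # joblib serializes the large numpy arrays inside the model efficiently
    joblib.dump({'svc': svc, 'scaler': scaler}, model_path)
    return svc, scaler
```

Reloading is then a single `joblib.load(model_path)` that restores both the classifier and the scaler.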
**Vehicle Detection** - A class that utilizes a region-limited sliding window search, heatmaps, thresholding, labelling, and a rolling sum to eventually filter out the vehicles.

- `__init__` - Initializes instance variables, e.g. for feature extraction and sliding window search

**Rolling Statistics: Moving Average and Rolling Sum**
A `RollingStatistics` object keeps a circular queue saving the previous `MEMORY_SIZE` frames, leveraging `Pandas` underneath. The `rolling_sum`-based heatmap accumulates heatmaps from the past `MEMORY_SIZE` frames and thresholds them together, thus eliminating one-off noisy detections. I experimented with 25+ different `MEMORY_SIZE` and `ROLLING_SUM_HEAT_THRESHOLD` combinations to come up with a video that was smooth, avoided false positives, and was responsive enough to a visible car in the video. I first experimented with `moving_average`, but soon realized that a literal moving average is a very strict thresholding criterion, so I decided to graduate to `rolling_sum`, which is simpler, more intuitive, more lenient, and offers finer thresholding control.
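A minimal sketch of what a pandas-backed `RollingStatistics` might look like: a circular buffer of flattened heatmaps, one row per frame, where the oldest frame is overwritten once `MEMORY_SIZE` frames have been stored (method and attribute names here are illustrative, not the project's exact API):

```python
import numpy as np
import pandas as pd

class RollingStatistics:
    """Keep the last `memory_size` heatmaps in a circular buffer (one row
    per frame in a pandas DataFrame) and expose rolling aggregates."""

    def __init__(self, frame_shape, memory_size=30):
        self.frame_shape = frame_shape
        self.memory_size = memory_size
        self._frames = pd.DataFrame(
            np.zeros((memory_size, int(np.prod(frame_shape)))))
        self._next = 0  # circular write index

    def add(self, heatmap):
        """Store a heatmap, overwriting the oldest one when memory is full."""
        self._frames.iloc[self._next] = heatmap.ravel()
        self._next = (self._next + 1) % self.memory_size

    def rolling_sum(self):
        """Pixel-wise sum over everything currently in memory."""
        return self._frames.sum(axis=0).to_numpy().reshape(self.frame_shape)

    def moving_average(self):
        """Pixel-wise mean over the memory window."""
        return self.rolling_sum() / self.memory_size
```

Thresholding the `rolling_sum()` result (rather than each frame's heatmap individually) is what suppresses one-off noisy detections while keeping persistent ones.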
- `sliding_window_search` - Performs the sliding window search. `XY_WINDOW` and `XY_OVERLAP` were defined as `(96, 96)` and 70% respectively. A 96 px window size is a fair middle ground that works well to identify cars both near and far, and a 70% overlap covers enough ground to avoid missing true positives; it also helps improve the heat score of a successful detection. Having a single scale makes the algorithm less robust, and this could be improved; see the caching discussion further down. The Y search region was limited to `[400, 656]` for optimization. I did not want any X region limit, as in the general case cars can be found in lanes both to the left and the right of the autonomous car.
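A sketch of how such a region-limited window generator might work, assuming the stated `(96, 96)` window, 70% overlap, and `[400, 656]` Y band (the function name `slide_window` is illustrative, not the project's exact code):

```python
def slide_window(img_width=1280, xy_window=(96, 96), xy_overlap=(0.7, 0.7),
                 y_start_stop=(400, 656)):
    """Generate ((x1, y1), (x2, y2)) window coordinates over the region
    of interest. Windows step across the full X range and the limited Y
    band, advancing (1 - overlap) * window_size pixels per step."""
    step_x = int(xy_window[0] * (1 - xy_overlap[0]))
    step_y = int(xy_window[1] * (1 - xy_overlap[1]))
    windows = []
    y = y_start_stop[0]
    while y + xy_window[1] <= y_start_stop[1]:
        x = 0
        while x + xy_window[0] <= img_width:
            windows.append(((x, y), (x + xy_window[0], y + xy_window[1])))
            x += step_x
        y += step_y
    return windows
```

Each window is then cropped, resized to the training size, featurized, and scored by the classifier; overlapping positive windows pile up heat in the heatmap.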
- 🐛 debug & 🎇 exception handling
- `update_overlay` - Highlights the sliding window search area, with an `identifier` and the `dimensions` of each `bounding_box`
- `heat_and_threshold` - Computes the heatmaps 🔥, labels, and bounding boxes with metrics for each labelled box
- `rolling_sum` - Gets a rolling-sum heatmap from memory
- `add_to_debugbar` - Insets debug information picture-in-picture, or rather video-in-video. Quite professional, you see! 👔

It was non-trivial to choose the hyperparameters; it has primarily been a trial-and-error process. Mohan Karthik's blog post was precise and served as a good general direction for the color-histogram and HOG parameters, but I still experimented with them on my own to determine what works best for me. As mentioned earlier, just the spatial features and channel histograms yielded a classifier test accuracy of 90%. I chose the `HLS` color space as it (or `HSV`) yielded great results for lane keeping; by some argument, `HLS` is more intuitive than `HSV`. I added HOG features to bump the accuracy up to 99%.
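A sketch of the heatmap-threshold-label step, using `scipy.ndimage.label` for connected-component labelling (the dict layout and function signature here are illustrative, not the project's exact `heat_and_threshold`):

```python
import numpy as np
from scipy.ndimage import label

def heat_and_threshold(heatmap, threshold):
    """Zero out weak pixels, then label connected components and return
    one bounding box (with id and dimension metrics) per labelled car."""
    heat = np.copy(heatmap)
    heat[heat < threshold] = 0          # suppress weak / noisy detections
    labels, n_cars = label(heat)        # connected components above threshold

    boxes = []
    for car_id in range(1, n_cars + 1):
        ys, xs = (labels == car_id).nonzero()
        box = ((xs.min(), ys.min()), (xs.max(), ys.max()))
        width, height = box[1][0] - box[0][0], box[1][1] - box[0][1]
        boxes.append({'id': car_id, 'box': box,
                      'width': width, 'height': height})
    return heat, boxes
```

The same routine works whether `heatmap` is a single frame's heatmap or the rolling sum; only the threshold value differs.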
It wasn't easy to visualize why the system didn't work for a given frame of video, and using a rolling sum made things even harder. Hence I decided to add a few elements to make my life easier:

- Insets of the Current Frame Heatmap and the Rolling Sum Heatmap
- Bounding boxes drawn from the Current Frame Heatmap, and THICK GREEN boxes from the Rolling Sum Heatmap
- `HeatTh: RollSum * CurFr` - the heat thresholds for the rolling sum and the current frame, 19 & 1 respectively
- `Memory` (1046) and classifier `Accuracy` readouts
- `id | width x height` printed around each box; this will be useful in considering a weighted average (see Enhancements below)

Figure: Example of a frame where a shorter bounding box needs to be merged with an adjacent bigger one
Figure: Example of a frame where the current implementation detects a long tail due to long frame memory
There was a tradeoff between a long tail and the possibility of not detecting a car at all. I chose to be conservative and err on the side of having a long tail.