This project was inspired by:
I swapped out YOLO v3 for YOLO v4 and added the option for asynchronous processing, which significantly improves the FPS. However, FPS monitoring is disabled when asynchronous processing is used since it isn't accurate.
In addition, I implemented the algorithm from this paper in deep_sort/track.py.
The original method for confirming tracks was based simply on the number of times an object has been detected without considering detection confidence, leading to high tracking false positive rates when unreliable detections occur (i.e. low confidence true positives or high confidence false positives). The track filtering algorithm reduces this significantly by calculating the average detection confidence over a set number of detections before confirming a track.
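The confirmation rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in deep_sort/track.py: the class name `SketchTrack`, the default values, and the choice to delete a track whose average falls below the threshold are all assumptions made for the example. Only the parameter names `n_init` and `adc_threshold` come from this project.

```python
from dataclasses import dataclass, field
from typing import List

TENTATIVE, CONFIRMED, DELETED = "tentative", "confirmed", "deleted"


@dataclass
class SketchTrack:
    """Hypothetical stand-in for a track; not the real deep_sort Track class."""

    n_init: int = 3               # detections to average over (default is an assumption)
    adc_threshold: float = 0.5    # average detection confidence threshold (default is an assumption)
    state: str = TENTATIVE
    confidences: List[float] = field(default_factory=list)

    def update(self, confidence: float) -> None:
        """Record one matched detection and decide whether to confirm the track."""
        if self.state != TENTATIVE:
            return
        self.confidences.append(confidence)
        if len(self.confidences) < self.n_init:
            return  # not enough detections yet to judge the track
        average = sum(self.confidences) / len(self.confidences)
        # Assumption: a track that fails the check is dropped rather than kept tentative.
        self.state = CONFIRMED if average > self.adc_threshold else DELETED


reliable = SketchTrack()
for c in (0.9, 0.8, 0.7):
    reliable.update(c)

unreliable = SketchTrack()
for c in (0.3, 0.4, 0.6):
    unreliable.update(c)
```

With the assumed defaults, `reliable` is confirmed (average 0.8) while `unreliable` is rejected (average about 0.43), even though both were detected three times. Counting detections alone would have confirmed both.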
See the comparison video below.
Navigate to the appropriate folder to use low confidence track filtering. The above video demonstrates the difference.
See the settings section for parameter instructions.
As you can see in the gif, asynchronous processing has better FPS but causes stuttering.
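One common way to make video reading asynchronous is to read frames on a background thread and let the processing loop pick up only the most recent one. The sketch below shows that generic pattern; it is an assumption about the approach, not a copy of this project's implementation. `LatestFrameReader` and `make_fake_source` are hypothetical names, and the fake source stands in for something like `cv2.VideoCapture.read`, which also returns an `(ok, frame)` pair.

```python
import threading
from typing import Any, Callable, Optional, Tuple


class LatestFrameReader:
    """Reads frames on a background thread and keeps only the newest one."""

    def __init__(self, read: Callable[[], Tuple[bool, Any]]) -> None:
        self._read = read
        self._lock = threading.Lock()
        self._frame: Optional[Any] = None
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> "LatestFrameReader":
        self._thread.start()
        return self

    def _run(self) -> None:
        # Keep overwriting the stored frame until the source is exhausted.
        while True:
            ok, frame = self._read()
            if not ok:
                break
            with self._lock:
                self._frame = frame

    def latest(self) -> Optional[Any]:
        """Return the newest frame seen so far, or None if nothing was read yet."""
        with self._lock:
            return self._frame

    def join(self) -> None:
        self._thread.join()


def make_fake_source(n: int) -> Callable[[], Tuple[bool, Any]]:
    """Hypothetical frame source yielding the integers 0..n-1 as 'frames'."""
    frames = iter(range(n))

    def read() -> Tuple[bool, Any]:
        try:
            return True, next(frames)
        except StopIteration:
            return False, None

    return read


reader = LatestFrameReader(make_fake_source(100)).start()
reader.join()
last_frame = reader.latest()
```

Because the reader never waits for the consumer, a slow detection step skips over intermediate frames, and a fast one may see the same frame twice. In a design like this, that would be one plausible source of the stuttering.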
This code only detects and tracks people, but it can be changed to detect other objects by editing line 103 in yolo.py. For example, to detect people and cars, change
if predicted_class != 'person':
    continue
to
if predicted_class not in ('person', 'car'):
    continue
Real-time FPS with video writing:
Turning off tracking gave ~12.5 FPS with YOLO v4.
YOLO v4 performs much faster and appears to be more stable than YOLO v3. All tests were done using an Nvidia GTX 1070 8 GB GPU and an i7-8700K CPU.
Download the Darknet YOLO v4 model and convert it to a Keras model by modifying convert.py accordingly, then run:
python convert.py
Then run demo.py:
python demo.py
By default, tracking and video writing are on and asynchronous processing is off. These can be changed in demo.py by editing:
tracking = True
writeVideo_flag = True
asyncVideo_flag = False
To change the target file, edit this line in demo.py:
file_path = 'video.webm'
To change the output settings, edit this line in demo.py:
out = cv2.VideoWriter('output_yolov4.avi', fourcc, 30, (w, h))
This version has the option to hide object detections instead of tracking. The settings in demo.py are:
show_detections = True
writeVideo_flag = True
asyncVideo_flag = False
Setting show_detections = False will hide object detections and show the average detection confidence and the most commonly detected class for each track.
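As a rough illustration of the two per-track values mentioned above, the following sketch computes the most commonly detected class and the average detection confidence from a track's detection history. The function name `summarize_track` and the `(class_name, confidence)` tuple layout are assumptions for the example; the project may store this information differently.

```python
from collections import Counter
from statistics import mean
from typing import List, Tuple


def summarize_track(detections: List[Tuple[str, float]]) -> Tuple[str, float]:
    """Return (most common class, average confidence) for one track.

    `detections` is a hypothetical history of (class_name, confidence) pairs.
    """
    if not detections:
        raise ValueError("a track needs at least one detection to summarize")
    class_counts = Counter(name for name, _ in detections)
    most_common_class = class_counts.most_common(1)[0][0]
    average_confidence = mean([conf for _, conf in detections])
    return most_common_class, average_confidence


history = [("person", 0.9), ("person", 0.7), ("car", 0.8)]
label, avg_conf = summarize_track(history)
```

Here `label` is `'person'` because it appears twice in the history, and `avg_conf` is approximately 0.8.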
To modify the average detection confidence threshold, go to deep_sort/tracker.py and change the adc_threshold argument on line 40. You can also change the number of steps over which the detection confidence is averaged by changing n_init here.
See https://github.com/Ma-Dan/keras-yolo4.
Please note that the tracking model used here is only trained on tracking people, so you'd need to train a model yourself for tracking other objects.
See https://github.com/nwojke/cosine_metric_learning for more details on training your own tracking model.
For those who want to train their own vehicle tracking model, I've created a tool for converting the DETRAC dataset into a trainable format for cosine metric learning, which can be found in my object tracking repository here. The tool was created using the earlier mentioned paper as a reference, with the same parameters.
Navigate to the appropriate folder and run python scripts.
(see requirements.txt)