rafaelpadilla / darknet

Useful functionalities added on the original darknet public repository.

YOLO modifications

  1. (Intro) What is the Darknet project?
  2. (Intro) What is YOLO?
  3. My modifications
    - Testing multiple images
    - Testing multiple thresholds
  4. FAQ

**Warning:** Because .weights files are large, I am sharing them on my Google Drive. The weights file needed to run this tutorial can be downloaded [here](https://drive.google.com/file/d/11PDp2P-onqr46mwQGPOxxXiP0Jg98QUY/view?usp=sharing). **Don't forget to copy this weights file to the folder /newdata/**.


Darknet

Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation. For more information, see the Darknet project website. For questions or issues, please use the Google Group.

YOLO

YOLO (You Only Look Once) is a real-time object detection and classification system that obtained excellent results on the Pascal VOC dataset. So far, YOLO has had several versions, including YOLO V1, YOLO V2 (also referred to as YOLO 9000) and YOLO V3. Click on the image below to watch YOLO 9000's promo video.

The authors have created a website explaining how it works, how to use it and how to train yolo with your images. Check the references below:

YOLO: You Only Look Once: Unified, Real-Time Object Detection (2016)
(Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi)
[official site]
[site] [pdf] [slides] [talk] [ted talk]

YOLO9000: Better, Faster, Stronger (2017)
(Joseph Redmon, Ali Farhadi)
[official site]
[site] [pdf] [talk] [slides]

YOLOv3: An Incremental Improvement (2018)
(Joseph Redmon, Ali Farhadi)
[official site]
[site] [pdf]

YOLO: People talking about it
[Andrew NG] [Siraj Raval]

YOLO: People writing about it (Explanations and codes)
[Towards data science]: A brief summary about yolo and how it works.
[Machine Think blog]: A brief summary about yolo and how it works.
[Timebutt's github]: A tutorial explaining how to train yolo 9000 to detect a single class object.
[Timebutt's github]: Read this if you want to understand yolo's training output -> Not everything is correct here. Be careful!
[Cvjena's github]: Comments of some of the tags used in the cfg files.
[Guanghan Ning's blog]: A tutorial explaining how to train yolo v1 with your own data. The author used two classes (yield and stop signs).
[AlexeyAB's github]: Very good project forked from yolo 9000 supporting Windows and Linux.
[Google's Group]: Excellent source of information. People ask and answer questions about darknet and yolo.
[Guanghan Ning's blog]: Studies and analysis on reducing the running time of Yolo on CPU.
[Guanghan Ning's blog]: Recurrent YOLO. This is an interesting work mixing recurrent network and yolo for object tracking.
[Jonathan Hui]: One of the most detailed and correct explanations about YOLO V2.
[Ayoosh Kathuria]: What’s new in YOLO v3?

My modifications:

I recently forked the official darknet project and modified it to meet my needs. Below you can find some additional functions I added to the original project.

All the examples can be easily run. You just need to clone or download this repository, compile and run the commands :)

Testing multiple images

Let's say you want to detect objects in one or more images given a network structure and your weights file. With this function you can also choose to visualize the results (images with bounding boxes) or save them. You can save the detections (bounding boxes and classes) in .txt files and also save the resulting images.

Another good thing is that you don't need to pass the arguments in a specific order anymore. This function makes the work easier by accepting the arguments in any order you want.

See the example below to detect multiple images:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -savetxt -saveimg

Arguments:

The output detections will be seen as:

If you add the -saveimg and -savetxt arguments, the results (_dets.txt and .png files) will be created in the results folder specified in your newdata/voc.data file as seen below:

See below the content of an image and its corresponding txt file:

Each line of the _dets.txt file represents a bounding box. The values representing a bounding box are: id confidence relative_center_x relative_center_y relative_width relative_height. The id is the index of the detected object's class in the list referenced by the names tag in your newdata/voc.data file. The confidence expresses, in %, how sure YOLO is of that detection. Remember the threshold in the ./darknet testimages command? The confidence of a detected object will always be equal to or higher than the threshold you set.
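To make the line format concrete, here is a minimal Python sketch that reads one line of a _dets.txt file and converts the relative values into a pixel box. It assumes the six values are separated by whitespace and that the relative values are fractions of the image width and height. The names `Detection`, `parse_detection` and `to_pixel_box`, as well as the sample line, are made up for illustration and are not part of this repository.

```python
from typing import NamedTuple, Tuple


class Detection(NamedTuple):
    """One bounding box, in the field order described for _dets.txt."""
    class_id: int
    confidence: float   # in %, as described above
    center_x: float     # relative to image width
    center_y: float     # relative to image height
    width: float        # relative to image width
    height: float       # relative to image height


def parse_detection(line: str) -> Detection:
    """Parse one line; assumes six whitespace-separated values."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 values, got {len(fields)}: {line!r}")
    return Detection(int(fields[0]), *(float(v) for v in fields[1:]))


def to_pixel_box(det: Detection, image_width: int, image_height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in pixels."""
    center_x = det.center_x * image_width
    center_y = det.center_y * image_height
    half_w = det.width * image_width / 2
    half_h = det.height * image_height / 2
    return (round(center_x - half_w), round(center_y - half_h),
            round(center_x + half_w), round(center_y + half_h))


# Hypothetical line, not taken from a real output file.
det = parse_detection("14 87.5 0.5 0.5 0.25 0.5")
print(to_pixel_box(det, image_width=400, image_height=200))  # (150, 50, 250, 150)
```

The image size has to come from the image itself, since the txt file only stores relative values.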

If you want to apply the detector to a single image instead, add the argument -img followed by the image's path, as shown in the example below:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -img newdata/images/000058.jpg -savetxt -saveimg

Add the argument -showimg if you want to visualize the resulting images as soon as the detector evaluates them. (Note: this feature requires compiling with OpenCV. To do so, change the 3rd line of the Makefile to OPENCV=1 and recompile.) Example:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -img newdata/images/000058.jpg -showimg

Remember that the arguments (file.data, network.cfg, file.weights, etc.) do not have to follow an exact order. You can specify them in any position you want. :)

Therefore the command:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -savetxt -saveimg

is equivalent to:

./darknet testimages newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights newdata/voc.data -saveimg -savetxt
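One way to picture order-independent arguments is the Python sketch below. It is only an illustration and not the C code used in this repository: it assumes the three files can be told apart by their extensions (.data, .cfg, .weights), and the function name `classify_args` is hypothetical. The flags are the ones shown in this tutorial.

```python
def classify_args(args):
    """Sort testimages-style arguments into named slots, whatever their order."""
    parsed = {"data": None, "cfg": None, "weights": None,
              "img": None, "thresh": None, "flags": set()}
    value_flags = {"-img": "img", "-thresh": "thresh"}   # flags followed by a value
    bool_flags = {"-savetxt", "-saveimg", "-showimg"}    # flags without a value
    extensions = {".data": "data", ".cfg": "cfg", ".weights": "weights"}

    it = iter(args)
    for arg in it:
        if arg in value_flags:
            value = next(it, None)
            if value is None:
                raise ValueError(f"{arg} needs a value")
            parsed[value_flags[arg]] = value
        elif arg in bool_flags:
            parsed["flags"].add(arg)
        else:
            # Assumption: positional files are recognized by extension.
            for ext, slot in extensions.items():
                if arg.endswith(ext):
                    parsed[slot] = arg
                    break
            else:
                raise ValueError(f"unrecognized argument: {arg}")
    return parsed


first = classify_args(
    "newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -savetxt -saveimg".split())
second = classify_args(
    "newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights newdata/voc.data -saveimg -savetxt".split())
print(first == second)  # True
```

Both orderings of the command end up in the same slots, which is the behavior described above.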

Testing multiple thresholds

Sometimes we need to test multiple images or just a single one with different threshold values.

Suppose you want to test your images with a range of threshold values starting at 30% going up to 100% with steps of 10%. In other words, you will be testing all the following 8 threshold values: 30%, 40%, 50%, 60%, 70%, 80%, 90% and 100%.

You don't need to run ./darknet testimages 8 times for that and separate your results into folders. You just need to use the argument -thresh, specifying the initial threshold, the incremental step and the final threshold.

The example below tests many threshold values (30%, 40%, 50%, 60%, 70%, 80%, 90% and 100%) on the image 000058.jpg.

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -img newdata/images/000058.jpg -savetxt -saveimg -thresh .30,.10,1

Attention: The 3 values after the -thresh argument must be separated by commas. This is how to read the argument -thresh .30,.10,1: test threshold values starting at 30% (0.30), adding an increment of 10% (0.10) each time, until the threshold reaches 100% (1).
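The expansion of a -thresh value into the list of thresholds can be sketched in Python as follows. This is an illustration of the rule described in this section, not the code used by darknet; the function name `expand_thresholds` is hypothetical, and `Decimal` is used only to avoid floating-point drift when adding the step repeatedly.

```python
from decimal import Decimal


def expand_thresholds(spec: str) -> list:
    """Expand 'start,step,stop' (or a single value) into a list of thresholds."""
    parts = [Decimal(p) for p in spec.split(",")]
    if len(parts) == 1:
        return [float(parts[0])]          # a single threshold, e.g. ".75"
    if len(parts) != 3:
        raise ValueError("expected one value or start,step,stop")
    start, step, stop = parts
    if step <= 0:
        raise ValueError("step must be positive")

    values = []
    current = start
    while current <= stop:                # values above the final threshold are dropped
        values.append(float(current))
        current += step
    return values


print(expand_thresholds(".30,.10,1"))   # [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
print(expand_thresholds(".45,.45,1"))   # [0.45, 0.9]
```

In the second call the next step would be 1.35, which is above the final threshold of 1, so only two values are produced.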

If somehow your steps reach a value higher than the final threshold, it won't be considered. Thus:

You can also use the command ./darknet testimages with the argument -thresh to test multiple thresholds on multiple images, as seen below:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -savetxt -saveimg -thresh .45,.45,1

As already presented in this tutorial, the paths of images to be evaluated must be listed in a .txt file identified in the newdata/voc.data with the test tag. The output files (_dets.txt and .png files) will be generated in the results folder specified in your newdata/voc.data.
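To check which image list and results folder a .data file points to, a small Python reader such as the one below can help. It is a sketch that assumes the file uses one `key = value` pair per line, with lines starting with `#` or `;` treated as comments. The function name `read_data_file` and all paths in the sample text are hypothetical and not copied from newdata/voc.data.

```python
def read_data_file(text: str) -> dict:
    """Parse 'key = value' lines, skipping blank lines and comments."""
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in "#;":
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue                      # ignore lines without '='
        options[key.strip()] = value.strip()
    return options


# Hypothetical content; the real file may contain other tags and paths.
sample = """
# example only
classes = 20
names = newdata/voc.names
test = newdata/test.txt
results = newdata/results
"""

options = read_data_file(sample)
print(options["test"])     # newdata/test.txt
print(options["results"])  # newdata/results
```

To use it on a real file, pass the file's contents, for example `read_data_file(open("newdata/voc.data").read())`.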

Because we are testing many thresholds, folders identifying each threshold will be created. All your output files will be added in their respective folder. The image below shows an example of the threshold folder structure created by adding the argument -thresh .20,.10,1:

Of course, the command ./darknet testimages also supports the -thresh argument with only one threshold. The example below shows how to test your images using a single threshold of 75%:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -savetxt -saveimg -thresh .75

FAQ YOLO

Question 1: What do those values mean during training?

Answer: During training, the samples are divided into batches and each batch is split into subdivisions; both values are set in the .cfg file. While darknet is training YOLO, statistics of the training are presented as shown in the image below:

The highlighted block represents the training of one batch. In this example, each batch contains 64 images divided into 8 subdivisions. Thus, for this particular case, each subdivision contains 8 images. Those values represent:

The last line gives statistics of the whole batch: