ratulKabir/Custom-Object-Detection-using-Darkflow

Intro

While learning YOLO I went through a lot of blogs, GitHub code and courses. Here I have tried to combine all of them and show how to work with my own dataset.

I have used Anaconda and Jupyter Notebook, with Darkflow to detect custom objects.
I also use Windows, so all my tips are most likely to work well on Windows.

Requirements

Python3, tensorflow 1.0, numpy, opencv 3. Links for installation below:

Python 3.5 or 3.6, Anaconda
Tensorflow. I recommend using the tensorflow GPU version. But if you don't have a GPU, just go ahead and install the CPU version.
GPUs can be far faster than CPUs for training and testing neural networks (speedups of more than 100x are often quoted). Find more here
Opencv
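Before going further it can help to confirm what is actually installed. The snippet below is a minimal sketch, not part of darkflow: `check_environment` is a hypothetical helper name, and it assumes the packages are importable as `tensorflow`, `numpy` and `cv2`.

```python
import importlib
import sys


def check_environment(modules=("tensorflow", "numpy", "cv2")):
    """Return a dict mapping each requirement to its version string.

    A module that cannot be imported is reported as None.
    """
    report = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
    return report
```

Calling `check_environment()` with no arguments and printing the result shows at a glance which of the requirements above are missing (those are reported as `None`).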

Download the Darkflow repo
Click this
Download and extract the files somewhere locally

Getting started

You can choose one of the following three ways to get started with darkflow. If you are using Python 3 on Windows you will need to install Microsoft Visual C++ 14.0. Here you can find the installation process, why it is required, references, etc., or you can try Stack Overflow.

Just build the Cython extensions in place. NOTE: If installing this way you will have to use ./flow in the cloned darkflow directory instead of flow as darkflow is not installed globally.
```
python3 setup.py build_ext --inplace
```
Let pip install darkflow globally in dev mode (still globally accessible, but changes to the code immediately take effect)
```
pip install -e .
```
Install with pip globally
```
pip install .
```

Download a weights file

Download the YOLOv2 608x608 weights file here
Read more about YOLO (in darknet) and download weight files here. In case the weights file cannot be found, you can check here, which includes yolo-full and yolo-tiny of v1.0, tiny-yolo-v1.1 of v1.1, and yolo and tiny-yolo-voc of v2. These weights are provided by Trieu.
NOTE: there are other weights files you can try if you like
create a weights folder within the darkflow-master folder
put the weights file in the weights folder

Make own Dataset

I have trained the model on around 250 images. I recommend using a much bigger dataset for better performance.

Dataset

To make a dataset of objects around you

start taking photos of the objects that you want to detect.
make sure you have pictures from different angles, in different poses, in different environments, etc.
try to make the dataset as big as possible for better performance.

Annotation

To annotate images, download labelImg.
Check this video to learn how to use labelImg.
Github repo for labelImg can be found here
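Once the images are annotated, it is worth checking which labels actually appear in the annotation files, since a typo in a label name is easy to miss. The sketch below assumes labelImg was left on its default Pascal VOC output (one XML file per image, one `<object>` element with a `<name>` per box). The helper names and the sample file content are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# Hypothetical, shortened example of one Pascal VOC annotation file.
SAMPLE_XML = """
<annotation>
  <filename>img_0001.jpg</filename>
  <object><name>king</name></object>
  <object><name>king</name></object>
  <object><name>ace</name></object>
</annotation>
""".strip()


def labels_in_annotation(root):
    """Count the object labels found under one <annotation> element."""
    return Counter(obj.findtext("name") for obj in root.iter("object"))


def labels_in_folder(annotation_dir):
    """Sum the label counts over every .xml file in a folder."""
    total = Counter()
    for xml_path in sorted(Path(annotation_dir).glob("*.xml")):
        total.update(labels_in_annotation(ET.parse(str(xml_path)).getroot()))
    return total


sample_counts = labels_in_annotation(ET.fromstring(SAMPLE_XML))
print(dict(sample_counts))
```

Running `labels_in_folder("train/Annotations")` on your own folder should list exactly the classes you intend to train on, with a count for each.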

Training on your own dataset

The steps below assume we want to use tiny YOLO and our dataset has 3 classes

Create a copy of the configuration file tiny-yolo-voc.cfg and rename it according to your preference tiny-yolo-voc-3c.cfg (It is crucial that you leave the original tiny-yolo-voc.cfg file unchanged, see below for explanation). Here tiny-yolo-voc-3c.cfg is for 3 classes, you can change the name as you wish.
In tiny-yolo-voc-3c.cfg, change classes in the [region] layer (the last layer) to the number of classes you are going to train for. In our case, classes are set to 3.
```
...

[region]
anchors = 1.08,1.19,  3.42,4.41,  6.63,11.38,  9.42,5.11,  16.62,10.52
bias_match=1
classes=3  ## 3 classes
coords=4
num=5
softmax=1

...
```
In tiny-yolo-voc-3c.cfg, change filters in the [convolutional] layer (the second to last layer) to num * (classes + 5). In our case, num is 5 and classes is 3, so 5 * (3 + 5) = 40 and filters is therefore set to 40.
```
...

[convolutional]
size=1
stride=1
pad=1
filters=40  ## 5 * (3 + 5) = 40
activation=linear

[region]
anchors = 1.08,1.19,  3.42,4.41,  6.63,11.38,  9.42,5.11,  16.62,10.52

...
```
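The arithmetic for filters can be written as a one-line function. This is only a sketch with a hypothetical function name. The 5 in the formula is taken here to be the 4 box coordinates (coords=4 in the [region] layer) plus 1 objectness score.

```python
def region_filters(num_classes, num_anchors=5, coords=4):
    """filters = num * (classes + coords + 1) for the layer before [region]."""
    return num_anchors * (num_classes + coords + 1)


print(region_filters(3))   # our 3-class example
print(region_filters(20))  # the 20 VOC classes of the original cfg
```

For 3 classes this gives 40, matching the value set above. For the 20 classes of the unmodified tiny-yolo-voc.cfg it gives 125.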
Change labels.txt to include the label(s) you want to train on (number of labels should be the same as the number of classes you set in tiny-yolo-voc-3c.cfg file). In my case, labels.txt will contain 3 labels.
```
king
ace
ten
```
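Since the number of labels, classes and filters all have to agree, a small consistency check can save a failed training run. The following is a minimal sketch, not darkflow code. `parse_cfg` and `check_cfg` are hypothetical helpers, and the parser only understands the simple `[section]` / `key=value` / `#` comment layout shown in the excerpts above.

```python
def parse_cfg(text):
    """Parse darknet-style cfg text into an ordered list of (section, options)."""
    sections = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def check_cfg(cfg_text, labels):
    """Return a list of problems; an empty list means the settings agree."""
    sections = parse_cfg(cfg_text)
    if len(sections) < 2 or sections[-1][0] != "region":
        return ["expected the cfg to end with [convolutional] then [region]"]
    conv, region = sections[-2][1], sections[-1][1]
    classes = int(region["classes"])
    expected = int(region["num"]) * (classes + int(region.get("coords", 4)) + 1)
    problems = []
    if classes != len(labels):
        problems.append("classes=%d but %d labels given" % (classes, len(labels)))
    if int(conv["filters"]) != expected:
        problems.append("filters=%s but expected %d" % (conv["filters"], expected))
    return problems


EXAMPLE_CFG = """
[convolutional]
size=1
filters=40  ## 5 * (3 + 5) = 40
activation=linear

[region]
classes=3  ## 3 classes
coords=4
num=5
"""

print(check_cfg(EXAMPLE_CFG, ["king", "ace", "ten"]))
```

To check real files, read the text of tiny-yolo-voc-3c.cfg and the non-empty lines of labels.txt and pass them in the same way.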
Reference the tiny-yolo-voc-3c.cfg model when you train.

python flow --model cfg/tiny-yolo-voc-3c.cfg --load weights/tiny-yolo-voc.weights --train --annotation train/Annotations --dataset train/Images --gpu 1.0 --epoch 300

On Windows you need to type python at the beginning, otherwise the flow command is not recognised. Next, specify the model with --model cfg/tiny-yolo-voc-3c.cfg and the weights with --load weights/tiny-yolo-voc.weights. After that, specify the paths for the annotations (--annotation train/Annotations) and the images (--dataset train/Images). Use --gpu 1.0 to train on the GPU for speed; if you do not have a GPU, leave this option out. You can also specify the number of epochs; the default is 1000. Training can be stopped at any time, however. I recommend training until the loss stays below 1.
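If you prefer to start training from a script or notebook cell, the same command can be assembled as an argument list. This is a sketch under assumptions: `build_train_command` and `run_training` are hypothetical helpers, only the flags explained above are used, and anything else (such as the epoch setting) is passed through unchanged via `extra_args`.

```python
import subprocess
import sys


def build_train_command(model, load, annotation, dataset, gpu=None, extra_args=()):
    """Build the argument list for darkflow's flow script.

    gpu=None leaves --gpu out entirely, which is what you want without a GPU.
    """
    cmd = [sys.executable, "flow",
           "--model", model,
           "--load", load,
           "--train",
           "--annotation", annotation,
           "--dataset", dataset]
    if gpu is not None:
        cmd += ["--gpu", str(gpu)]
    cmd += list(extra_args)  # any further flags, passed through as-is
    return cmd


def run_training(cmd, darkflow_dir):
    """Run the command from inside the darkflow folder, where flow lives."""
    subprocess.run(cmd, cwd=darkflow_dir, check=True)


cpu_cmd = build_train_command("cfg/tiny-yolo-voc-3c.cfg",
                              "weights/tiny-yolo-voc.weights",
                              "train/Annotations",
                              "train/Images")
print(" ".join(cpu_cmd[1:]))
```

Using `sys.executable` plays the role of typing python first, so the script is started by the same interpreter that runs the notebook.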

Why should I leave the original tiny-yolo-voc.cfg file unchanged?

When darkflow sees you are loading tiny-yolo-voc.weights, it will look for tiny-yolo-voc.cfg in your cfg/ folder and compare that configuration file to the new one you have set with --model cfg/tiny-yolo-voc-3c.cfg. In this case, every layer will have exactly the same number of weights except for the last two, so it will load the weights into all layers up to the last two, because those now contain a different number of weights.
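The idea of reusing weights up to the first layer that differs can be illustrated with a toy comparison. This is not darkflow's actual loading code. The function name is hypothetical and the per-layer weight counts are made-up numbers, chosen only so that the final entry differs.

```python
def count_reusable_layers(pretrained_sizes, new_sizes):
    """Count the leading layers whose weight counts match in both configs."""
    reusable = 0
    for old, new in zip(pretrained_sizes, new_sizes):
        if old != new:
            break  # first mismatch: this layer and later ones start fresh
        reusable += 1
    return reusable


# Illustrative weight counts per layer (not read from any real file).
original_cfg_sizes = [432, 4608, 18432, 128125]
three_class_sizes = [432, 4608, 18432, 41000]

print(count_reusable_layers(original_cfg_sizes, three_class_sizes))
```

If the original tiny-yolo-voc.cfg had been edited in place, both lists would describe the new network and there would be nothing left to compare the weights file against. That is why the copy is needed.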

Object Detection using YOLO

Open the object-detection-with-YOLO.ipynb file. I have tried to add comments to make it easy to understand.

Image

To detect objects in images:

Go to the Object Detection from Image section.
Replace the image name in the following line with the name of your own image
img = cv2.imread('images/img_2386.jpg', cv2.IMREAD_COLOR)
If you have multiple objects in your image, you have to define tl (top left), br (bottom right) and the label for each of the objects.
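Instead of defining tl and br by hand for every object, you can loop over the predictions. The sketch below assumes the predictions come as the list of dictionaries that darkflow's return_predict is documented to return, with the keys label, confidence, topleft and bottomright. The helper names and the sample values are hypothetical.

```python
def boxes_from_predictions(predictions, min_confidence=0.0):
    """Turn prediction dicts into (top_left, bottom_right, label) tuples."""
    boxes = []
    for pred in predictions:
        if pred["confidence"] < min_confidence:
            continue  # skip weak detections
        tl = (pred["topleft"]["x"], pred["topleft"]["y"])
        br = (pred["bottomright"]["x"], pred["bottomright"]["y"])
        boxes.append((tl, br, pred["label"]))
    return boxes


def draw_boxes(img, boxes, color=(0, 255, 0)):
    """Draw every box and its label on the image with OpenCV."""
    import cv2  # imported here so the helper above also works without OpenCV
    for tl, br, label in boxes:
        cv2.rectangle(img, tl, br, color, 2)
        cv2.putText(img, label, tl, cv2.FONT_HERSHEY_SIMPLEX, 1, color, 2)
    return img


# Made-up predictions in the assumed format.
sample = [
    {"label": "king", "confidence": 0.71,
     "topleft": {"x": 10, "y": 20}, "bottomright": {"x": 110, "y": 220}},
    {"label": "ace", "confidence": 0.12,
     "topleft": {"x": 200, "y": 30}, "bottomright": {"x": 290, "y": 210}},
]
print(boxes_from_predictions(sample, min_confidence=0.5))
```

With a real image, `draw_boxes(img, boxes_from_predictions(result))` would replace the per-object tl and br lines, where `result` stands for whatever the notebook gets back from the detector.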

Video

To detect objects in a video:

Go to the Object Detection from Video section.
Replace the video file name in the following line with the name of your own video
capture = cv2.VideoCapture('test2.mkv')
Run.
Press Q to quit
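For reference, the read, show and quit loop behind these steps can be sketched as follows. This is a minimal outline, not the notebook's exact code. `play`, `should_quit` and the `on_frame` callback are hypothetical names, and the detection step is left to whatever function you pass in.

```python
def should_quit(key_code):
    """True when the key code returned by cv2.waitKey is 'q' or 'Q'."""
    if key_code < 0:  # waitKey returns -1 when no key was pressed
        return False
    return chr(key_code & 0xFF).lower() == "q"


def play(source, on_frame=None, window_name="frame"):
    """Show frames from a video file (path) or a webcam (index) until Q."""
    import cv2  # imported here so should_quit can be used without OpenCV
    capture = cv2.VideoCapture(source)
    try:
        while capture.isOpened():
            ok, frame = capture.read()
            if not ok:  # end of the video or a read error
                break
            if on_frame is not None:
                frame = on_frame(frame)  # e.g. run detection and draw boxes
            cv2.imshow(window_name, frame)
            if should_quit(cv2.waitKey(1)):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Calling `play('test2.mkv')` would step through the video, while passing an integer such as 0 opens a webcam instead.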

Webcam

To detect objects from a webcam, just run the code in the Object Detection from Webcam section. If you have multiple webcams you may need to specify the correct number for the one you want. I use my laptop's default webcam, which is why I have used 0. To change the number, edit this line
capture = cv2.VideoCapture(0)

Press Q to quit
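If you are not sure which number belongs to which webcam, you can try a few indices and see which ones open. The function below is a hypothetical helper, sketched on the assumption that a camera which opens successfully reports `isOpened()` as true. The `open_capture` parameter exists only so the logic can be tried without a real camera.

```python
def find_cameras(max_index=4, open_capture=None):
    """Return the camera indices below max_index that can be opened."""
    if open_capture is None:
        import cv2  # only needed when probing real cameras
        open_capture = cv2.VideoCapture
    available = []
    for index in range(max_index):
        capture = open_capture(index)
        try:
            if capture.isOpened():
                available.append(index)
        finally:
            capture.release()  # always free the device again
    return available
```

Calling `find_cameras()` returns a list of indices, any of which can then be used in the cv2.VideoCapture line above.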

My webcam results are below.

My confidence scores are low because of the lack of data (about 250 images) and having no GPU. I had to stop training after 60 epochs. It took 9 hours and the loss was around 3.8. I was just trying to learn things, so that was enough for me.

References

Real-time object detection and classification. Paper: version 1, version 2.
Official YOLO website.
I learned YOLO and how it works from Coursera. Siraj also has a nice tutorial on it.
The original darkflow repo is this one by Trieu.
For a video walkthrough of the code and a better understanding, follow these videos. I followed Mark Jay a lot while making this project.