
Data augmentation #2289

Open maxwack opened 5 years ago

maxwack commented 5 years ago

Hello @AlexeyAB ,

I have a few questions about data augmentation. I have read most of the issues on this repository, and I know that this project performs data augmentation based on the settings in the cfg file.

My problem is that I have a very small data set: around 200 images for 2 classes. The images are medical ultrasound images, so, as you advised in another issue (https://github.com/AlexeyAB/darknet/issues/1758), I set the following in the cfg file (my images are ultrasound rather than X-ray, though):

saturation = 1.05
exposure = 2.5
hue=.01

First, I want to make clear that I successfully trained the network with this small data set and got decent results (85% correct classification and localization), but I want to improve this to 90% or more and was thinking about data augmentation.

1) You advise having 2000 images per class. →Do I need to do more data augmentation myself to reach 2000 images, or is the automatic augmentation enough?

2) I once tried to create 4000 images based on the 200 I had by changing exposure, saturation and rotation. But after doing this and training for 4000 iterations (2 classes × 2000), the network cannot detect anything anymore. →Is it because I need to train much longer now that I have many more images? Is it because my augmented data set has become too different from the images I want to detect, which are very close to my original data set? Or is there another reason?

3) I also tried to use the 200 images of my data set plus 200 unrelated medical images as negative images. For example, I have elbow ultrasound images which I want to detect, and I used chest X-ray images as negative images. (To do so, I created an empty annotation text file for each negative image; see the sketch after this list.) →Same as 2), I get no detections anymore. Any reason you can think of?

4) Is there anything else I can do to get a better result?
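
For reference, this is roughly how I create the empty annotation files for the negative images (a minimal sketch; the folder name is just a placeholder for my own layout):

import os

# placeholder folder containing the negative (object-free) images
negative_dir = "data/negative_images"

for name in os.listdir(negative_dir):
    if name.lower().endswith((".jpg", ".jpeg", ".png")):
        txt_path = os.path.splitext(os.path.join(negative_dir, name))[0] + ".txt"
        # an empty .txt file marks the image as containing no objects to detect
        open(txt_path, "w").close()

I then list these negative images in train.txt together with the normal training images.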

Thank you for your help.

AlexeyAB commented 5 years ago

@maxwack Hello,

  1. Usually the automatic augmentation is enough. Only rotation and vertical flip are not yet implemented (the cfg parameters that control the built-in augmentation are listed below).
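
For reference, these are the parameters in your cfg that control the built-in augmentation (the comments are my rough descriptions of what each one does; the values are just the ones from your cfg, not a recommendation):

[net]
# rotation angle (rotation is not applied for detector training, as noted above)
angle=0
# random saturation scaling, roughly between 1/1.05 and 1.05
saturation = 1.05
# random exposure (brightness) scaling, roughly between 1/2.5 and 2.5
exposure = 2.5
# random hue shift of about +/- 0.01
hue=.01

[yolo]
# random crop / translation of up to about 30% of the image size
jitter=.3
# multi-scale training: network input is randomly resized during training
random=1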

I tried once to create 4000 images based on the 200 I had by changing exposure, saturation and rotation. But after doing this and training for 4000 iterations (2 classes × 2000), the network cannot detect anything anymore.

  2. If you can't detect anything after 4000 iterations, even on images from the training dataset and even with a low threshold (-thresh 0.05), then most likely your dataset is broken. Check your 4000 images using Yolo_mark: are the bounding boxes placed correctly? https://github.com/AlexeyAB/Yolo_mark (A quick way to sanity-check the label files themselves is sketched after this list.) Also train with the -map flag so that you see the accuracy (mAP) during training: https://github.com/AlexeyAB/darknet#when-should-i-stop-training What mAP can you get?

  3. It also looks like your new dataset is broken; try to check it using Yolo_mark.

  4. What cfg-file do you use? Can you show a screenshot of the cloud of points produced by this command? ./darknet detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416 -show
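
As a rough sketch of what I mean by checking the labels (this assumes the standard YOLO txt format, one "class x_center y_center width height" line per object with values relative to the image size; the train.txt path and the class count below are placeholders):

import os
import sys

# placeholders: adjust to your own obj.data setup
train_list = sys.argv[1] if len(sys.argv) > 1 else "data/train.txt"
num_classes = 2

with open(train_list) as f:
    image_paths = [line.strip() for line in f if line.strip()]

for img_path in image_paths:
    txt_path = os.path.splitext(img_path)[0] + ".txt"
    if not os.path.exists(txt_path):
        print("missing label file:", txt_path)
        continue
    with open(txt_path) as labels:
        for i, line in enumerate(labels, 1):
            parts = line.split()
            if len(parts) != 5:
                print(f"{txt_path}:{i} unexpected field count: {line.strip()!r}")
                continue
            try:
                cls = int(parts[0])
                x, y, w, h = map(float, parts[1:])
            except ValueError:
                print(f"{txt_path}:{i} non-numeric values: {line.strip()!r}")
                continue
            if not 0 <= cls < num_classes:
                print(f"{txt_path}:{i} invalid class id {cls}")
            # coordinates must be relative (0..1) and boxes must not be empty
            if not (0 <= x <= 1 and 0 <= y <= 1 and 0 < w <= 1 and 0 < h <= 1):
                print(f"{txt_path}:{i} out-of-range box {x} {y} {w} {h}")

If this prints nothing but you still get no detections even on training images, the problem is more likely in the cfg (classes, filters, anchors) than in the labels.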

maxwack commented 5 years ago

Thank you for your fast reply.

1, 2, 3. OK, I will check my data set again.

  4. I also tried the 5-layer config file, but I am not sure it makes much sense, since the objects I want to detect are not especially small.

Here is the result of the command (cloud of anchor clusters):

[clusters screenshot]

And for the cfg file, I use this:

[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=32
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.05
exposure = 2.5
hue=.01

learning_rate=0.0005
burn_in=2000
max_batches = 4500
policy=steps
steps=3500,4000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

# Downsample

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

######################

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
stopbackward=1

[convolutional]
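# last layer before [yolo]: filters = (classes + 5) * 3 = (2 + 5) * 3 = 21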
size=1
stride=1
pad=1
filters=21
activation=linear

[yolo]
mask = 6,7,8
anchors = 151, 71, 147, 89, 167, 82, 189, 75, 184, 90, 170,113, 214, 91, 199,100, 230,109
classes=2
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

[route]
layers = -4

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 61

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=21
activation=linear

[yolo]
mask = 3,4,5
anchors = 151, 71, 147, 89, 167, 82, 189, 75, 184, 90, 170,113, 214, 91, 199,100, 230,109
classes=2
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 36

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=21
activation=linear

[yolo]
mask = 0,1,2
anchors = 151, 71, 147, 89, 167, 82, 189, 75, 184, 90, 170,113, 214, 91, 199,100, 230,109
classes=2
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

Thank you again.

AlexeyAB commented 5 years ago

@maxwack

Also try to train with default anchors.

And with:

learning_rate=0.001
burn_in=1000
max_batches = 6000
policy=steps
steps=5000,5500
scales=.1,.1
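
For reference, the anchors in the stock yolov3.cfg (my assumption of what "default anchors" refers to here) are:

anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
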

maxwack commented 5 years ago

Thank you.

I have 2 GPUs, so I usually train for 1000 iterations with the following settings:

learning_rate=0.001
burn_in=1000
max_batches = 1000
policy=steps
steps=500,700
scales=.1,.1

with the command: darknet.exe detector train data/object.data yolov3.cfg darknet53.conv.74

and then train for more iterations with the following settings:

learning_rate=0.0005
burn_in=2000
max_batches = 4500
policy=steps
steps=3500,4000
scales=.1,.1

and with the following command: darknet.exe detector train data/object.data yolov3.cfg backup/yolov3_final.weights -gpus 0,1