AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

Loss does not drop when training custom data with tiny-yolo after 3000 iterations #484

Open · wureka opened this issue 6 years ago

wureka commented 6 years ago

My custom image info:

image size: 800 x 480 (w x h)
channels: 3
class count: 7
training image count: 559 images
valid image count: 143 images

My tiny-yolo cfg:

[net]
batch=32
subdivisions=8
width=800
height=480
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
max_batches = 40200
policy=steps
steps=-1,100,20000,30000
scales=.1,10,.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=60
activation=linear

[region]
anchors = 1.08,1.19,  3.42,4.41,  6.63,11.38,  9.42,5.11,  16.62,10.52
bias_match=1
classes=7
coords=4
num=5
softmax=1
jitter=.2
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .6
random=1
small_object=0
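
For reference, filters=60 in the last convolutional layer follows from the region layer settings: each of the num anchors predicts coords box values, one objectness score, and classes class scores, so filters = num * (coords + 1 + classes). A quick check in Python:

# filters for the layer feeding the region layer:
# one predictor per anchor, each with coords + objectness + class scores
num, coords, classes = 5, 4, 7
print(num * (coords + 1 + classes))  # 5 * (4 + 1 + 7) = 60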

After 2000 iterations, the loss does not seem to drop anymore; it just oscillates between 0.1 and 0.2, and the Avg IOU stays between 0.7 and 0.85. [Loss vs Avg IOU chart]

How can I get the loss to go below 0.1?

sivagnanamn commented 6 years ago

Training loss depends on your training data and the model you're currently using. With the same model, you might end up with a different loss if your training data changes (and vice versa).

First check the performance of your model to decide further steps.

However, you can try the following to reduce the training loss:

1. Resize your input to a square size (e.g. 416x416)
2. Start from pre-trained weights
3. Recalculate the anchors for your dataset

wureka commented 6 years ago

@sivagnanamn Are you saying that the image size must be square (height and width equal)? My images are rectangular (800x480), but the height and width are each multiples of 32. Is that OK?

And I use the default tiny-yolo-voc.cfg; I only changed batch, width, height, classes, and filters.

The default weights were obtained by following the default instructions, which should be the pre-trained weights you mentioned. That leaves the anchors. I executed python2 gen_anchors.py -filelist train.txt and it generated 10 anchor files. Which anchor file should I use to replace the anchors in tiny-yolo-voc.cfg? Thanks

sivagnanamn commented 6 years ago

You can use the command below to generate custom anchors for your dataset.

./darknet detector calc_anchors obj.data -num_of_clusters 5 -final_width 13 -final_height 13

for 416x416 input --> final_width 13, final_height 13 (i.e. 416/32 = 13)
for 608x608 input --> final_width 19, final_height 19 (608/32 = 19)
for 832x832 input --> final_width 26, final_height 26 (832/32 = 26)
and so on.

If you're using a rectangular input size, calculate the anchors accordingly. Darknet will automatically resize the training images to the configured input width & height.
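
As a quick sanity check of that arithmetic (a sketch in Python, not darknet's own code), including the rectangular sizes used in this thread:

# the final grid is the input size divided by the network's total stride of 32
for w, h in [(416, 416), (800, 480), (640, 384)]:
    print(w, "x", h, "-> final_width", w // 32, ", final_height", h // 32)
# 416 x 416 -> final_width 13 , final_height 13
# 800 x 480 -> final_width 25 , final_height 15
# 640 x 384 -> final_width 20 , final_height 12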

(excerpt from YOLO 9000 paper) We run k-means clustering on the dimensions of bounding boxes to get good priors for our model. The left image shows the average IOU we get with various choices for k. We find that k = 5 gives a good tradeoff for recall vs. complexity of the model.

num_of_clusters is chosen based on experimentation. It depends on the size and shape of the objects in your training set. You can start with 5.
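
Below is a minimal sketch of the k-means step described in the excerpt, assuming boxes are given as (width, height) pairs normalized to [0, 1]; the function names are illustrative, not taken from gen_anchors.py or darknet:

import numpy as np

def iou_wh(box, centroids):
    # IoU between one (w, h) box and each centroid, both anchored at the origin
    inter = np.minimum(box[0], centroids[:, 0]) * np.minimum(box[1], centroids[:, 1])
    union = box[0] * box[1] + centroids[:, 0] * centroids[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=5, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # assign each box to the closest centroid, where distance = 1 - IoU
        assign = np.array([np.argmax(iou_wh(b, centroids)) for b in boxes])
        # recompute centroids; keep the old one if a cluster goes empty
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else centroids[i] for i in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids

# anchors in grid-cell units = normalized centroids * (input_w / 32, input_h / 32)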

Did you check the performance of your current model?

wureka commented 6 years ago

@sivagnanamn The performance? Do you mean the FPS when playing a video file? When tiny-yolo-voc.cfg is configured with width x height of 640x384 (7 classes), it runs at up to 11 FPS on my laptop with an NVIDIA 940MX. If the width and height are changed to 480x288, it reaches up to 16 FPS.

By the way, I found that the anchors generated by the two commands below differ somewhat, so I have no idea which one is correct or better.

./darknet detector calc_anchors cfg/ai.640x384.data -num_of_clusters 5 -final_width 20 -final_height 12

 num_of_clusters = 5, final_width = 20, final_height = 12 
 read labels from 559 images 
 loaded      image: 559      box: 894
 all loaded. 

 calculating k-means++ ...
 avg IoU = 71.46 % 

Saving anchors to the file: anchors.txt 
anchors = 6.0414,5.3336, 1.4120,1.2562, 2.2400,2.0053, 0.8012,0.6896, 4.1860,3.0086,

In gen_anchors.py, I made the modifications below:

width_in_cfg_file = 640.
height_in_cfg_file = 384.

and then ran:

python2 ./gen_anchors.py -filelist ~/tmp/dataset_640x384/train.txt -num_clusters 5 -output_dir ~/tmp/dataset_640x384/anchors
iter 1: dists = 3021.83838158
iter 2: dists = 394.514464201
iter 3: dists = 107.186099336
iter 4: dists = 65.0807363799
iter 5: dists = 74.1214117813
iter 6: dists = 46.7226788028
iter 7: dists = 45.8583716592
iter 8: dists = 39.5272050956
iter 9: dists = 46.9709509529
iter 10: dists = 31.417453265
iter 11: dists = 27.9244318208
iter 12: dists = 20.571036546
iter 13: dists = 11.4104899905
iter 14: dists = 4.98283848931
iter 15: dists = 5.13740222041
iter 16: dists = 8.30260748213
iter 17: dists = 4.78700138387
iter 18: dists = 5.19822094812
('Centroids = ', array([[ 0.10760115,  0.16092995],
       [ 0.03990759,  0.0564709 ],
       [ 0.29265293,  0.41589096],
       [ 0.19325658,  0.24051992],
       [ 0.06934247,  0.1021724 ]]))
(5, 2)
('Anchors = ', array([[ 0.79815175,  0.67765079],
       [ 1.38684944,  1.22606879],
       [ 2.15202295,  1.93115945],
       [ 3.86513158,  2.88623907],
       [ 5.85305851,  4.99069155]]))
()
('centroids.shape', (5, 2))
cat ~/tmp/dataset_640x384/anchors/anchors5.txt 
0.80,0.68, 1.39,1.23, 2.15,1.93, 3.87,2.89, 5.85,4.99
0.719477
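
The printed Centroids and Anchors are in fact consistent: scaling each normalized centroid by the output grid (640/32 = 20 columns, 384/32 = 12 rows) reproduces the anchors, just in a different order. A quick check:

import numpy as np

centroids = np.array([[0.10760115, 0.16092995],
                      [0.03990759, 0.0564709 ],
                      [0.29265293, 0.41589096],
                      [0.19325658, 0.24051992],
                      [0.06934247, 0.1021724 ]])
# scale normalized (w, h) by the grid size: 640/32 = 20 wide, 384/32 = 12 tall
print(np.round(centroids * [640 / 32, 384 / 32], 2))
# each row matches one of the anchors above, e.g. 0.1076 * 20 = 2.15, 0.1609 * 12 = 1.93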

Currently I am using the anchors generated by darknet; training is now approaching iteration 700:

Region Avg IOU: 0.670915, Class: 0.998630, Obj: 0.000000, No Obj: 0.000099, Avg Recall: 1.000000,  count: 5
Region Avg IOU: 0.852281, Class: 0.999031, Obj: 0.000000, No Obj: 0.000126, Avg Recall: 1.000000,  count: 7
Region Avg IOU: 0.541176, Class: 0.998838, Obj: 0.000000, No Obj: 0.000097, Avg Recall: 0.571429,  count: 7
Region Avg IOU: 0.709287, Class: 0.995932, Obj: 0.000000, No Obj: 0.000117, Avg Recall: 1.000000,  count: 6
Region Avg IOU: 0.638325, Class: 0.997921, Obj: 0.000000, No Obj: 0.000101, Avg Recall: 0.875000,  count: 8
Region Avg IOU: 0.727720, Class: 0.999902, Obj: 0.000000, No Obj: 0.000242, Avg Recall: 1.000000,  count: 4
Region Avg IOU: 0.608950, Class: 0.999312, Obj: 0.000000, No Obj: 0.000123, Avg Recall: 0.727273,  count: 11
Region Avg IOU: 0.709739, Class: 0.957349, Obj: 0.000004, No Obj: 0.000108, Avg Recall: 1.000000,  count: 7
688: 0.095692, 0.142007 avg, 0.001000 rate, 5.063898 seconds, 22016 images
Loaded: 0.000077 seconds
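
For reference, the fields in darknet's training status line break down as:

688              - iteration (batch) number
0.095692         - loss for this batch
0.142007 avg     - running average loss (the value to watch)
0.001000 rate    - current learning rate
5.063898 seconds - time taken for this batch
22016 images     - total images seen so far (688 iterations x batch 32 = 22016)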

However, I found that when running

./darknet detector test cfg/ai.640x384.data cfg/tiny-yolo-ai.640x384.cfg  ./backup/tiny-yolo-ai_700.weights ~/tmp/dataset_640x384/images/car/vid01_021733.jpg

darknet detects nothing. I will try the other set of anchors generated by gen_anchors.py.

AlexeyAB commented 6 years ago

@wureka If you use tiny-yolo, you should train for at least 2000 iterations before testing detection.