AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Got worse mAP in yolov2-tiny-voc with 608 x 608 resolution. #4549

Open hanseahn opened 4 years ago

hanseahn commented 4 years ago

In the original yolov2-voc, mAP increased when the resolution was increased (416 x 416 -> 544 x 544).

So I assumed that the mAP of yolov2-tiny-voc would also increase at a higher resolution.

However, I got a different result.

When I trained yolov2-tiny-voc at 416 x 416 resolution, I got an mAP of about 45%. (command: ./darknet detector train voc.data yolov2-tiny-voc.cfg darknet53.conv.74 -map)

But when I trained yolov2-tiny-voc at 608 x 608 resolution, I got an mAP of about 35%. (command: ./darknet detector train voc.data yolov2-tiny-voc.cfg darknet53.conv.74 -map)

Is there a way to increase the mAP of yolov2-tiny-voc at 608 x 608 resolution, so that it is better than yolov2-tiny-voc at 416 x 416?

Below is my cfg setting. I changed width, height, and subdivisions from the original yolov2-tiny-voc.cfg in the cfg folder. (subdivisions=2 causes a CUDA out-of-memory error in my environment. Same with 416 x 416.)

[net]
# Testing
batch=1
subdivisions=1
# Training
batch=64
subdivisions=4
width=608
height=608
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
max_batches = 40200
policy=steps
steps=-1,100,20000,30000
scales=.1,10,.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=125
activation=linear

[region]
anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
bias_match=1
classes=20
coords=4
num=5
softmax=1
jitter=.2
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .6
random=1
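For readers checking the numbers in this cfg, here is a minimal Python sketch of the quantities it implies. It is not part of Darknet. It assumes that Darknet processes batch / subdivisions images per forward pass, that the stride-1 maxpool keeps the spatial size, and that the last convolution needs num * (classes + coords + 1) filters. The function names are made up for illustration.

```python
# Values copied from the cfg posted above.
BATCH = 64
SUBDIVISIONS = 4
MAXPOOL_STRIDES = [2, 2, 2, 2, 2, 1]  # strides of the six [maxpool] layers
NUM_ANCHORS = 5
CLASSES = 20
COORDS = 4


def mini_batch(batch, subdivisions):
    """Images per forward pass, assuming batch is split evenly."""
    return batch // subdivisions


def total_stride(strides):
    """Overall downsampling factor of the stacked maxpool layers."""
    factor = 1
    for s in strides:
        factor *= s
    return factor


def grid_size(input_side, strides):
    """Side length of the final feature map for a square input."""
    return input_side // total_stride(strides)


def region_filters(num, classes, coords):
    """Filters needed by the convolution that feeds the [region] layer."""
    return num * (classes + coords + 1)


if __name__ == "__main__":
    print("mini-batch:", mini_batch(BATCH, SUBDIVISIONS))
    print("grid at 416:", grid_size(416, MAXPOOL_STRIDES))
    print("grid at 608:", grid_size(608, MAXPOOL_STRIDES))
    print("filters:", region_filters(NUM_ANCHORS, CLASSES, COORDS))
```

Under these assumptions the posted cfg is internally consistent: the 125 filters match 5 anchors and 20 classes, and moving from 416 to 608 changes the output grid from 13 x 13 to 19 x 19.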

AlexeyAB commented 4 years ago

What repo did you use?

hanseahn commented 4 years ago

Your repo, not pjreddie's.

AlexeyAB commented 4 years ago

Either the lower subdivisions value increased the mAP at 416x416, or your objects are large, in which case you shouldn't use 608x608.
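One way to look at the object-size point is to compare how much of the image each anchor covers at the two resolutions. The sketch below is only an illustration, not something stated in the thread. It assumes that the [region] anchors are expressed in units of final feature-map cells (stride 32) and that the anchors were left unchanged when the resolution was raised, as the posted cfg suggests. The function name is hypothetical.

```python
# Anchors copied from the [region] section of the posted cfg.
ANCHORS = [
    (1.08, 1.19),
    (3.42, 4.41),
    (6.63, 11.38),
    (9.42, 5.11),
    (16.62, 10.52),
]


def anchor_coverage(anchors, input_side, stride=32):
    """Return (width, height) of each anchor as a fraction of the image.

    Assumes anchors are given in feature-map cells, so the fraction is
    the anchor size divided by the grid side (input_side // stride).
    """
    grid = input_side // stride
    return [(w / grid, h / grid) for w, h in anchors]


if __name__ == "__main__":
    for side in (416, 608):
        print(side, [(round(w, 2), round(h, 2))
                     for w, h in anchor_coverage(ANCHORS, side)])
```

With the same anchors, every prior covers a smaller fraction of the image at 608 than at 416. For example, the widest anchor goes from roughly 1.28 to roughly 0.87 of the image width. If the dataset's objects are large relative to the image, recomputing the anchors for the new grid may be worth trying, though the thread does not confirm that this explains the mAP drop.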