pjreddie / darknet

Convolutional Neural Networks
http://pjreddie.com/darknet/

Why doesn't YOLO v2 training reach a loss below 1? #482

Open nyanmn opened 6 years ago

nyanmn commented 6 years ago

I am training YOLO v2 on my own custom objects, and the loss has never dropped below 1. I used about 700 images, each containing 10-20 objects, and I trained for only one class.

The [net] section of my configuration file has the following settings:

[net]
# Testing
batch=32
subdivisions=8
# Training
# batch=64
# subdivisions=8
height=416
width=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 40000
policy=steps
steps=30000,40000
scales=.1,.1
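As a sanity check on these settings, here is a minimal Python sketch of how the learning rate and the number of images seen follow from the config. It assumes that policy=steps multiplies the rate by each scale once the iteration reaches the matching step, and that burn_in ramps the rate with an exponent of 4 (which I believe is Darknet's default `power`). The function names are hypothetical and not part of Darknet.

```python
def current_rate(iteration, base_rate=0.001, burn_in=1000,
                 steps=(30000, 40000), scales=(0.1, 0.1), power=4.0):
    """Approximate learning rate at `iteration` for policy=steps.

    Assumption: mirrors Darknet's schedule (burn-in ramp, then step scaling).
    """
    if iteration < burn_in:
        # Warm-up: the rate grows from 0 to base_rate over burn_in iterations.
        return base_rate * (iteration / burn_in) ** power
    rate = base_rate
    for step, scale in zip(steps, scales):
        if iteration < step:
            break
        rate *= scale  # each step that has been reached scales the rate
    return rate


def images_seen(iteration, batch=32):
    """Images processed so far: one batch per iteration."""
    return iteration * batch


if __name__ == "__main__":
    it = 31415
    print("rate at", it, "=", current_rate(it))
    print("images seen =", images_seen(it))
    # With roughly 700 training images this is the approximate number of epochs.
    print("approx. epochs =", images_seen(it) / 700)
```

With the values above, iteration 31415 gives a rate of about 0.0001 and 1005280 images, which agrees with the "0.000100 rate" and "1005280 images" fields in the training output. It also suggests that batch=32 was the active setting during training, even though it sits under the "# Testing" comment.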

The lowest loss I have reached is about 6. I trained for 30,000 iterations, but the loss has stayed at 6-7 since iteration 10,000.

Some of the training output:

31415: 8.348057, 8.386817 avg, 0.000100 rate, 12.111334 seconds, 1005280 images
Loaded: 0.000060 seconds
Region Avg IOU: 0.615588, Class: 1.000000, Obj: 0.401514, No Obj: 0.004181, Avg Recall: 0.666667,  count: 15
Region Avg IOU: 0.822111, Class: 1.000000, Obj: 0.672842, No Obj: 0.007252, Avg Recall: 1.000000,  count: 12
Region Avg IOU: 0.804924, Class: 1.000000, Obj: 0.429950, No Obj: 0.003716, Avg Recall: 1.000000,  count: 10
Region Avg IOU: 0.491349, Class: 1.000000, Obj: 0.440711, No Obj: 0.008872, Avg Recall: 0.521739,  count: 23
Region Avg IOU: 0.807440, Class: 1.000000, Obj: 0.737001, No Obj: 0.005962, Avg Recall: 1.000000,  count: 10
Region Avg IOU: 0.575739, Class: 1.000000, Obj: 0.533677, No Obj: 0.005668, Avg Recall: 0.555556,  count: 18
Region Avg IOU: 0.698396, Class: 1.000000, Obj: 0.679842, No Obj: 0.014044, Avg Recall: 0.735294,  count: 34
Region Avg IOU: 0.724214, Class: 1.000000, Obj: 0.658260, No Obj: 0.022814, Avg Recall: 0.862745,  count: 51
31416: 7.889908, 8.337126 avg, 0.000100 rate, 10.935622 seconds, 1005312 images
Loaded: 0.000066 seconds
Region Avg IOU: 0.745393, Class: 1.000000, Obj: 0.562690, No Obj: 0.010314, Avg Recall: 0.857143,  count: 21
Region Avg IOU: 0.584905, Class: 1.000000, Obj: 0.590116, No Obj: 0.006717, Avg Recall: 0.789474,  count: 19
Region Avg IOU: 0.392405, Class: 1.000000, Obj: 0.367634, No Obj: 0.007094, Avg Recall: 0.404762,  count: 42
Region Avg IOU: 0.380257, Class: 1.000000, Obj: 0.348274, No Obj: 0.002768, Avg Recall: 0.350000,  count: 20
Region Avg IOU: 0.544494, Class: 1.000000, Obj: 0.535225, No Obj: 0.016623, Avg Recall: 0.603175,  count: 63
Region Avg IOU: 0.769785, Class: 1.000000, Obj: 0.656626, No Obj: 0.018179, Avg Recall: 1.000000,  count: 26
Region Avg IOU: 0.580423, Class: 1.000000, Obj: 0.562693, No Obj: 0.014675, Avg Recall: 0.630435,  count: 46
Region Avg IOU: 0.525704, Class: 1.000000, Obj: 0.497039, No Obj: 0.009762, Avg Recall: 0.588235,  count: 34
31417: 8.580651, 8.361479 avg, 0.000100 rate, 11.232055 seconds, 1005344 images
Loaded: 0.000060 seconds
Region Avg IOU: 0.742997, Class: 1.000000, Obj: 0.615459, No Obj: 0.007021, Avg Recall: 0.833333,  count: 18
Region Avg IOU: 0.495435, Class: 1.000000, Obj: 0.482534, No Obj: 0.013241, Avg Recall: 0.581818,  count: 55
Region Avg IOU: 0.653591, Class: 1.000000, Obj: 0.376574, No Obj: 0.007412, Avg Recall: 0.782609,  count: 23
Region Avg IOU: 0.648442, Class: 1.000000, Obj: 0.630538, No Obj: 0.007765, Avg Recall: 0.666667,  count: 15
Region Avg IOU: 0.586668, Class: 1.000000, Obj: 0.567470, No Obj: 0.007282, Avg Recall: 0.708333,  count: 24
Region Avg IOU: 0.711382, Class: 1.000000, Obj: 0.635664, No Obj: 0.011141, Avg Recall: 0.833333,  count: 24
Region Avg IOU: 0.669231, Class: 1.000000, Obj: 0.619737, No Obj: 0.012247, Avg Recall: 0.814815,  count: 27
Region Avg IOU: 0.394252, Class: 1.000000, Obj: 0.236007, No Obj: 0.005367, Avg Recall: 0.441176,  count: 34
31418: 9.481880, 8.473519 avg, 0.000100 rate, 12.165609 seconds, 1005376 images
Loaded: 0.000189 seconds
Region Avg IOU: 0.630928, Class: 1.000000, Obj: 0.511797, No Obj: 0.004950, Avg Recall: 0.764706,  count: 17
Region Avg IOU: 0.831299, Class: 1.000000, Obj: 0.780635, No Obj: 0.007082, Avg Recall: 1.000000,  count: 12
Region Avg IOU: 0.644808, Class: 1.000000, Obj: 0.578301, No Obj: 0.014045, Avg Recall: 0.738095,  count: 42
Region Avg IOU: 0.717070, Class: 1.000000, Obj: 0.622271, No Obj: 0.013107, Avg Recall: 0.875000,  count: 24
Region Avg IOU: 0.556818, Class: 1.000000, Obj: 0.522573, No Obj: 0.011312, Avg Recall: 0.595238,  count: 42
Region Avg IOU: 0.571265, Class: 1.000000, Obj: 0.510450, No Obj: 0.007567, Avg Recall: 0.740741,  count: 27
Region Avg IOU: 0.623863, Class: 1.000000, Obj: 0.583767, No Obj: 0.007240, Avg Recall: 0.684211,  count: 19
Region Avg IOU: 0.798751, Class: 1.000000, Obj: 0.711914, No Obj: 0.013085, Avg Recall: 0.960000,  count: 25
31419: 6.443977, 8.270565 avg, 0.000100 rate, 11.671758 seconds, 1005408 images
Loaded: 0.000039 seconds
Region Avg IOU: 0.701247, Class: 1.000000, Obj: 0.687701, No Obj: 0.013300, Avg Recall: 0.812500,  count: 32
Region Avg IOU: 0.626426, Class: 1.000000, Obj: 0.566588, No Obj: 0.013742, Avg Recall: 0.707317,  count: 41
Region Avg IOU: 0.470713, Class: 1.000000, Obj: 0.437131, No Obj: 0.006095, Avg Recall: 0.482759,  count: 29
Region Avg IOU: 0.643689, Class: 1.000000, Obj: 0.528787, No Obj: 0.010996, Avg Recall: 0.800000,  count: 25
Region Avg IOU: 0.557152, Class: 1.000000, Obj: 0.471046, No Obj: 0.006122, Avg Recall: 0.571429,  count: 14
Region Avg IOU: 0.775964, Class: 1.000000, Obj: 0.681793, No Obj: 0.007043, Avg Recall: 1.000000,  count: 12
Region Avg IOU: 0.810874, Class: 1.000000, Obj: 0.715747, No Obj: 0.006037, Avg Recall: 0.916667,  count: 12
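Output like this is easier to judge when it is aggregated. The following is a small Python sketch that parses the "Region Avg IOU" lines and computes averages weighted by each line's count. The function name and the choice of count-weighting are my own assumptions, not something Darknet provides.

```python
import re

# Matches one "Region Avg IOU" line as printed in the output above.
REGION_RE = re.compile(
    r"Region Avg IOU: ([\d.]+), Class: ([\d.]+), Obj: ([\d.]+), "
    r"No Obj: ([\d.]+), Avg Recall: ([\d.]+),\s+count: (\d+)"
)


def summarize_regions(log_text):
    """Return count-weighted mean IOU and recall over all region lines."""
    total = 0
    iou_sum = 0.0
    recall_sum = 0.0
    for match in REGION_RE.finditer(log_text):
        iou, _cls, _obj, _no_obj, recall, count = match.groups()
        n = int(count)
        total += n
        iou_sum += float(iou) * n
        recall_sum += float(recall) * n
    if total == 0:
        return None  # no region lines found
    return {
        "boxes": total,
        "avg_iou": iou_sum / total,
        "avg_recall": recall_sum / total,
    }


if __name__ == "__main__":
    sample = (
        "Region Avg IOU: 0.615588, Class: 1.000000, Obj: 0.401514, "
        "No Obj: 0.004181, Avg Recall: 0.666667,  count: 15\n"
        "Region Avg IOU: 0.822111, Class: 1.000000, Obj: 0.672842, "
        "No Obj: 0.007252, Avg Recall: 1.000000,  count: 12\n"
    )
    print(summarize_regions(sample))
```

Running it over a full training log (for example one saved by redirecting Darknet's output to a file) shows whether IOU and recall are still improving even when the loss value looks flat.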

I would like to know whether something in these settings is wrong, or what I need to improve to get a lower loss. The anchors and ground-truth boxes have an IOU of 0.84 in my training.

Grabber commented 6 years ago

Try to recalculate the anchors.
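One common way to recalculate anchors is k-means clustering over the labelled box sizes, using IOU instead of Euclidean distance to assign boxes to clusters. The sketch below is a minimal pure-Python version under that assumption; it is not Darknet's own code, and the function names are hypothetical. Boxes are (width, height) pairs in whatever unit the anchors should end up in.

```python
import random


def wh_iou(box, anchor):
    """IOU of two (width, height) boxes that share the same center."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (w, h) pairs into k anchors, assigning by highest IOU."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k randomly chosen boxes
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, cluster in enumerate(clusters):
            if cluster:
                w = sum(b[0] for b in cluster) / len(cluster)
                h = sum(b[1] for b in cluster) / len(cluster)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchors[i])  # keep anchor of an empty cluster
        if new_anchors == anchors:
            break  # assignments are stable
        anchors = new_anchors
    return anchors


def mean_best_iou(boxes, anchors):
    """Average, over all boxes, of the IOU with the best-matching anchor."""
    return sum(max(wh_iou(b, a) for a in anchors) for b in boxes) / len(boxes)


if __name__ == "__main__":
    boxes = [(1.0, 1.0), (1.1, 0.9), (5.0, 5.0), (5.2, 4.8)]  # toy data
    anchors = kmeans_anchors(boxes, k=2)
    print(sorted(anchors), mean_best_iou(boxes, anchors))
```

As far as I know, the anchors in a YOLO v2 cfg are given in units of output-grid cells (input size divided by 32, so 13 cells for a 416 input), so normalized label widths and heights would need to be scaled accordingly before clustering. `mean_best_iou` gives a single number for how well a set of anchors fits the labelled boxes.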

nyanmn commented 6 years ago

Do you mean that a higher IOU will lower the loss? My anchors and ground-truth (GT) boxes already have an IOU of 0.84. How much more does it need to improve? Are the anchors the only issue? What else could it be?

heenim33 commented 6 years ago

You can try './darknet detector train cfg/voc.data cfg/yolo-voc.2.0.cfg darknet19_448.conv.23' instead of './darknet detector train cfg/voc.data cfg/yolo-voc.cfg darknet19_448.conv.23'.

nyanmn commented 6 years ago

Sure I'll try and update.

mhaghighat commented 6 years ago

@nyanmn: I have a similar issue. Did you figure out which parameters to change to lower the loss? I was afraid that training was stuck in a local minimum, so I tried tweaking the learning_rate and the number of anchors, but they do not make the difference that I'm looking for. Thanks.

PraveenJP commented 5 years ago

@nyanmn @mhaghighat I have a similar issue when training YOLOv3-tiny. Did you find the cause?