pjreddie / darknet

Convolutional Neural Networks
http://pjreddie.com/darknet/

Training not working, receiving several "nan"? #644

Open IMABUNNEH opened 6 years ago

IMABUNNEH commented 6 years ago

Hi,

I'm trying to train darknet on satellite images. I'm using the VEDAI dataset (https://downloads.greyc.fr/vedai/), which comes with pre-annotated images.

When training, using command: ./darknet detector train cfg/voc-test.data cfg/yolov3-voc-test.cfg darknet53.conv.74

(the cfg file is a copy of yolov3-voc.cfg, the .data file obviously is doing its own thing).

My output lines look like this:

Region 82 Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.006032, .5R: nan, .75R: nan,  count: 0
Region 94 Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.002985, .5R: nan, .75R: nan,  count: 0
Region 106 Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.001486, .5R: nan, .75R: nan,  count: 0
303: 0.062799, 0.065735 avg, 0.000008 rate, 41.358321 seconds, 303 images
Loaded: 0.000071 seconds
Region 82 Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.006273, .5R: nan, .75R: nan,  count: 0
Region 94 Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.003000, .5R: nan, .75R: nan,  count: 0
Region 106 Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.001442, .5R: nan, .75R: nan,  count: 0
304: 0.066957, 0.065857 avg, 0.000009 rate, 41.720671 seconds, 304 images

I can't work out what the issue is. My first thought was the annotations, but they appear to be correct.
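In the log above, every nan sits next to count: 0. That would be consistent with an average taken over zero matched boxes, though this is an assumption about how darknet computes these figures and is not confirmed in this thread. A minimal sketch for tallying it from a saved log (the function name and regex are illustrative, not part of darknet):

```python
import re

# Matches the per-scale lines printed during training, e.g.
# "Region 82 Avg IOU: nan, Class: nan, ..., count: 0"
REGION_RE = re.compile(r"Region (\d+) Avg IOU: (-?nan|[-+0-9.eE]+),.*?count: (\d+)")


def summarize_regions(log_text):
    """Return a (layer, avg_iou_is_nan, count) tuple for each Region line."""
    rows = []
    for m in REGION_RE.finditer(log_text):
        layer = int(m.group(1))
        is_nan = m.group(2).lstrip("-") == "nan"
        rows.append((layer, is_nan, int(m.group(3))))
    return rows


# Three lines copied from the log in this post.
sample = """\
Region 82 Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.006032, .5R: nan, .75R: nan,  count: 0
Region 94 Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.002985, .5R: nan, .75R: nan,  count: 0
Region 106 Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.001486, .5R: nan, .75R: nan,  count: 0
"""

for layer, is_nan, count in summarize_regions(sample):
    print(layer, "nan" if is_nan else "ok", "count =", count)
```

If count stays at 0 for every layer over many iterations, that suggests no ground-truth boxes are reaching the detection layers at all, which points back at the labels or the .data paths.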

ahsan856jalal commented 6 years ago

I think the bounding boxes for the objects are smaller than 1% of the width and height of the overall image. I suggest you use AlexeyAB's fork of darknet and put small_object=1 in the [yolo] layer.

Regards Ahsan
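A rough way to test the "smaller than 1%" guess is to count how many labelled boxes fall under that size. This is only a sketch: it assumes each label line is "class x_center y_center width height" with the last four values normalized to the image size, and the function name and sample values are made up for illustration.

```python
def count_small_boxes(label_lines, threshold=0.01):
    """Count boxes whose normalized width AND height are both below threshold.

    Assumes each line is "class x_center y_center width height" with
    width and height expressed as fractions of the image size.
    Returns (small_boxes, total_boxes).
    """
    small = 0
    total = 0
    for line in label_lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        w, h = float(parts[3]), float(parts[4])
        total += 1
        if w < threshold and h < threshold:
            small += 1
    return small, total


# Hypothetical label lines, not taken from the VEDAI data.
example = [
    "0 0.500 0.500 0.008 0.006",  # under 1% in both dimensions
    "1 0.250 0.750 0.100 0.200",  # comfortably larger
]
small, total = count_small_boxes(example)
print(f"{small} of {total} boxes are below 1% of the image size")
```

Running this over all label files for the training set would show whether tiny boxes are the exception or the norm.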


tigerdhl commented 6 years ago

@IMABUNNEH hi, did you fix your problem? I get the same problem, as follows:

Loading weights from darknet53.conv.74...1 yolov3-voc Done!
Learning Rate: 1e-06, Momentum: 0.9, Decay: 0.0005
Loaded: 0.694139 seconds
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.433165, .5R: -nan, .75R: -nan, count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.596311, .5R: -nan, .75R: -nan, count: 0
Region 106 Avg IOU: 0.127817, Class: 0.475858, Obj: 0.168880, No Obj: 0.497860, .5R: 0.125000, .75R: 0.000000, count: 8
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.435357, .5R: -nan, .75R: -nan, count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.593578, .5R: -nan, .75R: -nan, count: 0
Region 106 Avg IOU: 0.159961, Class: 0.241856, Obj: 0.190167, No Obj: 0.500917, .5R: 0.000000, .75R: 0.000000, count: 6
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.434504, .5R: -nan, .75R: -nan, count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.588221, .5R: -nan, .75R: -nan, count: 0
Region 106 Avg IOU: 0.078549, Class: 0.633850, Obj: 0.339179, No Obj: 0.497340, .5R: 0.000000, .75R: 0.000000, count: 6

tigerdhl commented 6 years ago

@ahsan856jalal thank you, but I don't understand how to set small_object=1.

Pattorio commented 6 years ago

@IMABUNNEH @tigerdhl hey guys, have you fixed the problem yet?

I think you have batch=1 set, so the net cannot train well (your log shows only one image per iteration). Open yolov3.cfg (or whichever .cfg file you are using) and set a larger batch.
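The "one image per iteration" reading can be checked straight from a status line. The sketch below assumes the first field is the iteration number and the last field is the cumulative number of images seen so far; the helper name is made up for illustration.

```python
import re

# Status lines look like:
# "303: 0.062799, 0.065735 avg, 0.000008 rate, 41.358321 seconds, 303 images"
STATUS_RE = re.compile(r"^(\d+): .*?(\d+) images$")


def images_per_iteration(status_line):
    """Estimate images processed per iteration from one status line.

    Returns None if the line does not look like a status line.
    """
    m = STATUS_RE.match(status_line.strip())
    if m is None:
        return None
    iteration, images = int(m.group(1)), int(m.group(2))
    if iteration == 0:
        return None
    return images / iteration


line = "303: 0.062799, 0.065735 avg, 0.000008 rate, 41.358321 seconds, 303 images"
print(images_per_iteration(line))
```

A result of 1.0, as for the line quoted from the first post, fits an effective batch size of 1.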

AnaRhisT94 commented 5 years ago

Check that your annotations really are in the correct format and, obviously, that they are normalized.
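A quick sanity check along these lines might look like the sketch below. It assumes the label layout is one "class x_center y_center width height" line per object, with the four box values normalized to the range 0 to 1; the function name and the num_classes value are placeholders, so adjust them to your own setup.

```python
def check_label_line(line, num_classes):
    """Return a list of problems found in one label line (empty if it looks fine).

    Assumes the layout "class x_center y_center width height", with the
    box values given as fractions of the image width and height.
    """
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        cls = int(parts[0])
        x, y, w, h = (float(p) for p in parts[1:])
    except ValueError:
        return ["non-numeric field"]

    problems = []
    if not 0 <= cls < num_classes:
        problems.append(f"class id {cls} outside 0..{num_classes - 1}")
    for name, value in (("x_center", x), ("y_center", y)):
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name}={value} is not normalized")
    for name, value in (("width", w), ("height", h)):
        if not 0.0 < value <= 1.0:
            problems.append(f"{name}={value} is not normalized")
    return problems


# Hypothetical lines: the second uses pixel coordinates instead of fractions.
print(check_label_line("0 0.5 0.5 0.1 0.2", num_classes=9))
print(check_label_line("0 512 384 40 20", num_classes=9))
```

Pixel coordinates left unnormalized are an easy mistake when converting a dataset that ships with its own annotation format.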