AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

map scores are all 0 #820

Open Karthik-Suresh93 opened 6 years ago

Karthik-Suresh93 commented 6 years ago

Hi,

I am training YOLOv3 on my custom dataset. I trained for about 3400 iterations and the error seems to be converging at around 0.3. However, when I run the mAP test with ./darknet detector map... I get the results shown below:

for thresh = 0.25, precision = -nan, recall = 0.00, F1-score = -nan
for thresh = 0.25, TP = 0, FP = 0, FN = 42846, average IoU = -nan %

mean average precision (mAP) = 0.000000, or 0.00 %
Total Detection Time: 53.000000 Seconds

I don't understand why this is happening. My data file looks like this:

classes = 12
train = data/train.txt
valid = data/test.txt
names = data/obj.names
backup = backup/
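
(The full command I run follows the usual pattern ./darknet detector map <path to .data file> <path to cfg file> <path to weights file>; the actual cfg and weights file names are omitted above.)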

Please help me and let me know where the training is going wrong.

AlexeyAB commented 6 years ago

Hi, try to set valid = data/train.txt. And provide more information: did you use Yolo_mark, and what did you change in the cfg-file?

Karthik-Suresh93 commented 6 years ago

Hi,

I tried to set valid = data/train.txt and got the same results.

I suspect it is because of the annotations. I used a custom dataset where the boxes are given as <x>, <y>, <w>, <h>, where x and y are the coordinates of the top-left corner of the bounding box. [I know YOLO expects the center coordinates of the bounding box.]

I used the following code to convert it into yolo format:

```python
def convert(size, box):
    # size[0], size[1] are the image width and height.
    # box[0], box[1] are the x and y coordinates of the top-left corner of the bounding box.
    # box[2], box[3] are the width and height of the bounding box.
    dw = 1. / size[0]
    dh = 1. / size[1]

    # convert from top-left coordinates to center coordinates
    box[0] = float(box[0]) + float(box[2]) / 2.0
    box[1] = float(box[1]) + float(box[3]) / 2.0

    # normalize to [0, 1]
    x = box[0] * dw
    y = box[1] * dh
    w = float(box[2]) * dw
    h = float(box[3]) * dh
    return (x, y, w, h)
```
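
As a quick sanity check (hypothetical numbers): a 100x50 px box whose top-left corner is at (200, 150) in a 1000x800 image should come out as (0.25, 0.21875, 0.1, 0.0625):

```python
# box is [x_left, y_top, w, h] in pixels; size is (image_width, image_height)
print(convert((1000, 800), [200, 150, 100, 50]))
# expected output: (0.25, 0.21875, 0.1, 0.0625)
```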

After using this code, all the values were converted to numbers between 0 and 1, as described on your GitHub page. The link to the data format is here: Please let me know if there is any mistake in my approach.

Thanks and Regards, Karthik

AlexeyAB commented 6 years ago

Open your dataset in the https://github.com/AlexeyAB/Yolo_mark

Karthik-Suresh93 commented 6 years ago

By "open my dataset", do you mean open a new issue with a link to my data in the Yolo_mark repository you've provided?

AlexeyAB commented 6 years ago

No, I mean install Yolo_mark and run the command: ./yolo_mark ./img ./data/train.txt ./data/obj.names

And you will see whether you marked the objects correctly.

Karthik-Suresh93 commented 6 years ago

I tried Yolo_mark and all the objects are marked approximately correctly.

AlexeyAB commented 6 years ago

@Karthik-Suresh93 Can you detect anything using your cfg/weights? What batch and subdivisions did you use for training? What else did you change in your cfg file?

Karthik-Suresh93 commented 6 years ago

I figured out my mistake. I was training with the coordinates of the top-left corner of the bounding box instead of the center. After correcting this, I get an mAP of 0.01% after 1000 iterations. I will close this issue after checking the mAP again after a few thousand more iterations. Thank you very much for your help.

AlexeyAB commented 6 years ago

That's why I always recommend using https://github.com/AlexeyAB/Yolo_mark

Karthik-Suresh93 commented 6 years ago

The average loss seems to be stuck at 7.7-8 for the last 1000 iterations or so (I am currently at iteration 3134; the dataset has 12 classes and is a very difficult dataset). I am suspicious about the anchor sizes calculated by calc_anchors. Here is my output:

anchors = 13.0788,16.8780, 31.4323,356.1869, 13.3417,964.0239, 289.3022,81.5903, 16.4985,1743.8701, 15.8474,2934.3330, 957.3582,210.2993, 1745.9518,205.4940, 2944.1074,208.2814

Aren't some anchors too big? Also, the image dimensions vary across my training set, although all of them are larger than 448x448. Could you please let me know if there is an issue here?

AlexeyAB commented 6 years ago

Can you show the entire [net] section from your cfg-file?

Karthik-Suresh93 commented 6 years ago

[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=64
subdivisions=16
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 50200
policy=steps
steps=40000,45000
scales=.1,.1

[convolutional] batch_normalize=1 filters=32 size=3 stride=1 pad=1 activation=leaky

# Downsample

[convolutional] batch_normalize=1 filters=64 size=3 stride=2 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=32 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

# Downsample

[convolutional] batch_normalize=1 filters=128 size=3 stride=2 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

# Downsample

[convolutional] batch_normalize=1 filters=256 size=3 stride=2 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

# Downsample

[convolutional] batch_normalize=1 filters=512 size=3 stride=2 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

# Downsample

[convolutional] batch_normalize=1 filters=1024 size=3 stride=2 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

[shortcut] from=-3 activation=linear

######################

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] size=1 stride=1 pad=1 filters=51 activation=linear

[yolo] mask = 6,7,8 anchors = 13.0788,16.8780, 31.4323,356.1869, 13.3417,964.0239, 289.3022,81.5903, 16.4985,1743.8701, 15.8474,2934.3330, 957.3582,210.2993, 1745.9518,205.4940, 2944.1074,208.2814 classes=12 num=9 jitter=.3 ignore_thresh = .5 truth_thresh = 1 random=1

[route] layers = -4

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[upsample] stride=2

[route] layers = -1, 61

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=512 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=512 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=512 activation=leaky

[convolutional] size=1 stride=1 pad=1 filters=51 activation=linear

[yolo] mask = 3,4,5 anchors = 13.0788,16.8780, 31.4323,356.1869, 13.3417,964.0239, 289.3022,81.5903, 16.4985,1743.8701, 15.8474,2934.3330, 957.3582,210.2993, 1745.9518,205.4940, 2944.1074,208.2814 classes=12 num=9 jitter=.3 ignore_thresh = .5 truth_thresh = 1 random=1

[route] layers = -4

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[upsample] stride=2

[route] layers = -1, 36

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=256 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=256 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=256 activation=leaky

[convolutional] size=1 stride=1 pad=1 filters=51 activation=linear

[yolo] mask = 0,1,2 anchors = 13.0788,16.8780, 31.4323,356.1869, 13.3417,964.0239, 289.3022,81.5903, 16.4985,1743.8701, 15.8474,2934.3330, 957.3582,210.2993, 1745.9518,205.4940, 2944.1074,208.2814 classes=12 num=9 jitter=.3 ignore_thresh = .5 truth_thresh = 1 random=1

AlexeyAB commented 6 years ago

anchors = 13.0788,16.8780, 31.4323,356.1869, 13.3417,964.0239, 289.3022,81.5903, 16.4985,1743.8701, 15.8474,2934.3330, 957.3582,210.2993, 1745.9518,205.4940, 2944.1074,208.2814

Aren't some anchors too big?

Too big. Looks like you made a mistake when you calculated your anchors.

Try to recalculate anchors: ./darknet detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416

Karthik-Suresh93 commented 6 years ago

I recalculated and obtained the same anchors.

./darknet detector calc_anchors data/visdrone.data -num_of_clusters 9 -width 416 -height 416

num_of_clusters = 9, width = 416, height = 416
read labels from 6471 images
loaded image: 6471 box: 370238
all loaded.

calculating k-means++ ...

avg IoU = 30.36 %

Saving anchors to the file: anchors.txt
anchors = 13.0788,16.8780, 31.4323,356.1869, 13.3417,964.0239, 289.3022,81.5903, 16.4985,1743.8701, 15.8474,2934.3330, 957.3582,210.2993, 1745.9518,205.4940, 2944.1074,208.2814

What could be the issue?

AlexeyAB commented 6 years ago

It seems your labels have values that are more than 1.0.

Can you share your dataset using Google Drive? Or, if it is small, just compress it and drag-and-drop it into your message.
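
A quick way to check for this is a small script along the following lines. It assumes the usual darknet layout, where each image path listed in train.txt has a matching .txt label file containing lines of the form <class> <x_center> <y_center> <width> <height>, with all four coordinates normalized to [0, 1]:

```python
import os
import sys

def check_labels(train_list="data/train.txt"):
    """Print every label line whose coordinates fall outside [0, 1]."""
    bad = 0
    with open(train_list) as f:
        image_paths = f.read().split()
    for image_path in image_paths:
        # the label file is assumed to sit next to the image, with a .txt extension
        label_path = os.path.splitext(image_path)[0] + ".txt"
        if not os.path.exists(label_path):
            print("missing label file:", label_path)
            continue
        with open(label_path) as lf:
            for line_no, line in enumerate(lf, 1):
                parts = line.split()
                if len(parts) != 5:
                    print("malformed line %s:%d: %s" % (label_path, line_no, line.strip()))
                    continue
                coords = [float(v) for v in parts[1:]]
                if any(c < 0.0 or c > 1.0 for c in coords):
                    print("out of range %s:%d: %s" % (label_path, line_no, line.strip()))
                    bad += 1
    print("boxes with out-of-range coordinates:", bad)

if __name__ == "__main__":
    check_labels(sys.argv[1] if len(sys.argv) > 1 else "data/train.txt")
```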

Karthik-Suresh93 commented 6 years ago

https://drive.google.com/drive/folders/1JlHiIjlN2j-4PnKP0ATYSGFjQkMs157V?usp=sharing is the link to the labels. Please let me know if you have any difficulty accessing it.

AlexeyAB commented 6 years ago

@Karthik-Suresh93 Can you provide your train.txt file?

Karthik-Suresh93 commented 6 years ago

I found a bug in my annotation files, which I corrected. The new anchors are:

anchors = 4.1444,6.3716, 6.1866,14.4790, 15.3169,12.6948, 11.5459,26.3693, 29.4176,24.0713, 20.5760,43.9119, 55.5021,42.8288, 33.4039,72.5790, 77.2426,103.3075

I changed this in the cfg file (in 3 places), but the same thing is still happening. From iteration ~800 to ~1800 the average loss is about 7.7 to 8, and the mAP is 0.01% with the 1700-iteration weights. Why is this happening? Is this because of setting the image dimensions to 448x448, even though the training data contains images of much larger dimensions? Or is it because of the batch and subdivisions?

Karthik-Suresh93 commented 6 years ago

train_visdrone.txt (the train.txt file)

Karthik-Suresh93 commented 6 years ago

1744: 8.052785, 7.794055 avg, 0.001000 rate, 18.263474 seconds, 111616 images Loaded: 0.000039 seconds Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000942, .5R: -nan, .75R: -nan, count: 0 Region 94 Avg IOU: 0.610213, Class: 0.646893, Obj: 0.171735, No Obj: 0.000749, .5R: 0.666667, .75R: 0.166667, count: 6 Region 106 Avg IOU: 0.474142, Class: 0.594815, Obj: 0.072844, No Obj: 0.000374, .5R: 0.571429, .75R: 0.142857, count: 7 Region 82 Avg IOU: 0.734617, Class: 0.332207, Obj: 0.131849, No Obj: 0.001483, .5R: 1.000000, .75R: 0.400000, count: 5 Region 94 Avg IOU: 0.616090, Class: 0.431164, Obj: 0.111269, No Obj: 0.002213, .5R: 0.771429, .75R: 0.171429, count: 35 Region 106 Avg IOU: 0.401710, Class: 0.584862, Obj: 0.023814, No Obj: 0.000491, .5R: 0.400000, .75R: 0.030769, count: 65 Region 82 Avg IOU: 0.626446, Class: 0.804783, Obj: 0.208752, No Obj: 0.002105, .5R: 1.000000, .75R: 0.000000, count: 7 Region 94 Avg IOU: 0.568811, Class: 0.302078, Obj: 0.158489, No Obj: 0.001246, .5R: 0.736842, .75R: 0.052632, count: 19 Region 106 Avg IOU: 0.478135, Class: 0.460701, Obj: 0.022209, No Obj: 0.000592, .5R: 0.527027, .75R: 0.067568, count: 74 Region 82 Avg IOU: 0.645729, Class: 0.481606, Obj: 0.024088, No Obj: 0.000826, .5R: 0.750000, .75R: 0.250000, count: 8 Region 94 Avg IOU: 0.691299, Class: 0.412217, Obj: 0.146995, No Obj: 0.001244, .5R: 0.923077, .75R: 0.230769, count: 13 Region 106 Avg IOU: 0.433459, Class: 0.373143, Obj: 0.065316, No Obj: 0.000620, .5R: 0.380952, .75R: 0.111111, count: 63 Region 82 Avg IOU: 0.643723, Class: 0.735708, Obj: 0.087148, No Obj: 0.001930, .5R: 1.000000, .75R: 0.000000, count: 1 Region 94 Avg IOU: 0.619043, Class: 0.181622, Obj: 0.076533, No Obj: 0.001998, .5R: 0.750000, .75R: 0.250000, count: 4 Region 106 Avg IOU: 0.417721, Class: 0.228522, Obj: 0.057493, No Obj: 0.000623, .5R: 0.411765, .75R: 0.058824, count: 17 Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000688, .5R: -nan, .75R: -nan, count: 0 Region 94 Avg IOU: 0.586769, Class: 0.255652, Obj: 0.092186, No Obj: 0.000962, .5R: 0.818182, .75R: 0.090909, count: 11 Region 106 Avg IOU: 0.387902, Class: 0.172480, Obj: 0.011453, No Obj: 0.000319, .5R: 0.393939, .75R: 0.030303, count: 33 Region 82 Avg IOU: 0.815956, Class: 0.629576, Obj: 0.250676, No Obj: 0.001420, .5R: 1.000000, .75R: 1.000000, count: 2 Region 94 Avg IOU: 0.693611, Class: 0.724990, Obj: 0.134114, No Obj: 0.001178, .5R: 1.000000, .75R: 0.333333, count: 9 Region 106 Avg IOU: 0.385674, Class: 0.424374, Obj: 0.033343, No Obj: 0.000454, .5R: 0.300000, .75R: 0.100000, count: 30 Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000948, .5R: -nan, .75R: -nan, count: 0 Region 94 Avg IOU: 0.735768, Class: 0.449285, Obj: 0.214493, No Obj: 0.001675, .5R: 1.000000, .75R: 0.500000, count: 4 Region 106 Avg IOU: 0.484918, Class: 0.376544, Obj: 0.055268, No Obj: 0.000493, .5R: 0.580645, .75R: 0.129032, count: 31 Region 82 Avg IOU: 0.728501, Class: 0.601891, Obj: 0.188179, No Obj: 0.001866, .5R: 0.900000, .75R: 0.600000, count: 10 Region 94 Avg IOU: 0.627405, Class: 0.385184, Obj: 0.120357, No Obj: 0.001682, .5R: 0.821429, .75R: 0.214286, count: 28 Region 106 Avg IOU: 0.536988, Class: 0.407095, Obj: 0.039204, No Obj: 0.000573, .5R: 0.548387, .75R: 0.161290, count: 31 Region 82 Avg IOU: 0.712608, Class: 0.980018, Obj: 0.333620, No Obj: 0.001282, .5R: 1.000000, .75R: 0.000000, count: 1 Region 94 Avg IOU: 0.524521, Class: 0.636817, Obj: 0.115983, No Obj: 0.001217, .5R: 0.500000, .75R: 0.000000, count: 2 Region 106 Avg IOU: 0.589548, Class: 
0.426721, Obj: 0.096697, No Obj: 0.000392, .5R: 0.500000, .75R: 0.500000, count: 4 Region 82 Avg IOU: 0.391772, Class: 0.472847, Obj: 0.128104, No Obj: 0.000676, .5R: 0.500000, .75R: 0.000000, count: 2 Region 94 Avg IOU: 0.665620, Class: 0.670863, Obj: 0.214605, No Obj: 0.002128, .5R: 0.909091, .75R: 0.181818, count: 22 Region 106 Avg IOU: 0.498593, Class: 0.563573, Obj: 0.060426, No Obj: 0.000405, .5R: 0.500000, .75R: 0.000000, count: 10 Region 82 Avg IOU: 0.640983, Class: 0.667879, Obj: 0.081727, No Obj: 0.001109, .5R: 1.000000, .75R: 0.000000, count: 2 Region 94 Avg IOU: 0.644223, Class: 0.542706, Obj: 0.124982, No Obj: 0.001481, .5R: 0.916667, .75R: 0.166667, count: 12 Region 106 Avg IOU: 0.443331, Class: 0.521294, Obj: 0.044477, No Obj: 0.000522, .5R: 0.437500, .75R: 0.125000, count: 16 Region 82 Avg IOU: 0.761693, Class: 0.922725, Obj: 0.149089, No Obj: 0.001441, .5R: 1.000000, .75R: 1.000000, count: 1 Region 94 Avg IOU: 0.451224, Class: 0.506311, Obj: 0.125245, No Obj: 0.001014, .5R: 0.454545, .75R: 0.090909, count: 11 Region 106 Avg IOU: 0.384488, Class: 0.520966, Obj: 0.040560, No Obj: 0.000385, .5R: 0.311111, .75R: 0.022222, count: 45 Region 82 Avg IOU: 0.611993, Class: 0.860623, Obj: 0.098714, No Obj: 0.002143, .5R: 0.666667, .75R: 0.333333, count: 3 Region 94 Avg IOU: 0.560624, Class: 0.449646, Obj: 0.078149, No Obj: 0.001093, .5R: 0.666667, .75R: 0.000000, count: 3 Region 106 Avg IOU: 0.683319, Class: 0.189246, Obj: 0.019282, No Obj: 0.000390, .5R: 1.000000, .75R: 0.000000, count: 1 Region 82 Avg IOU: 0.739426, Class: 0.535030, Obj: 0.221267, No Obj: 0.002459, .5R: 1.000000, .75R: 0.000000, count: 2 Region 94 Avg IOU: 0.677726, Class: 0.366324, Obj: 0.069762, No Obj: 0.001453, .5R: 0.500000, .75R: 0.500000, count: 4 Region 106 Avg IOU: 0.515406, Class: 0.500924, Obj: 0.044172, No Obj: 0.000519, .5R: 0.413793, .75R: 0.068966, count: 29 Region 82 Avg IOU: 0.655260, Class: 0.469090, Obj: 0.267196, No Obj: 0.001518, .5R: 0.800000, .75R: 0.400000, count: 5 Region 94 Avg IOU: 0.668547, Class: 0.553302, Obj: 0.150471, No Obj: 0.001602, .5R: 0.842105, .75R: 0.315789, count: 19 Region 106 Avg IOU: 0.460187, Class: 0.546342, Obj: 0.037805, No Obj: 0.000553, .5R: 0.467742, .75R: 0.080645, count: 62

AlexeyAB commented 6 years ago

Show the [net] section from your cfg-file.

Karthik-Suresh93 commented 6 years ago

[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=64
subdivisions=16
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 50200
policy=steps
steps=40000,45000
scales=.1,.1

Karthik-Suresh93 commented 6 years ago

After 8000 iterations the loss is still ~6 and the mAP is 0.01%. Should I stop training or wait for some more iterations?

AlexeyAB commented 6 years ago

The problem is definitely your dataset; check it with: https://github.com/AlexeyAB/Yolo_mark

Karthik-Suresh93 commented 6 years ago

I recently found that my dataset has a lot of very small objects (<10 pixels in width and height). Could this be the reason the model is not converging? Please let me know if these small objects somehow affect the loss function value and make the model diverge.

AlexeyAB commented 6 years ago

anchors = 4.1444,6.3716, 6.1866,14.4790, 15.3169,12.6948, 11.5459,26.3693, 29.4176,24.0713, 20.5760,43.9119, 55.5021,42.8288, 33.4039,72.5790, 77.2426,103.3075

I recently found that my dataset has a lot of very small objects (<10 pixels in width and height). Could this be the reason the model is not converging? Please let me know if these small objects somehow affect the loss function value and make the model diverge.

  1. Try to increase your width=832 height=832 (see the [net] snippet below), then check what the avg loss is after 2000-4000 iterations.

  2. Did you check your dataset using Yolo_mark? https://github.com/AlexeyAB/Yolo_mark
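
For the resolution change, only the [net] section needs to be edited (both values must be multiples of 32); subdivisions may also need to be raised so the larger network still fits in GPU memory, for example:

[net]
batch=64
subdivisions=32
width=832
height=832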

Karthik-Suresh93 commented 6 years ago

  1. I will increase the resolution and get back to you.
  2. Yes, I checked a few images and the annotations seem to match. I could not check all of them manually.
  3. There were a few bad annotations that I found, though; I removed them (bounding boxes with area ~0).