AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

[HELP] mAP Calculation #3488

Open devjaynemorais opened 5 years ago

devjaynemorais commented 5 years ago

Hello, I would like to know which mAP (@0.5, @0.75, @0.95, ...) was used for Table 3 on page 4 of the paper YOLO9000: Better, Faster, Stronger.

| Detection Frameworks | Train | mAP | FPS |
|---|---|---|---|
| YOLOv2 416 × 416 | 2007 + 2012 | 76.8 | 67 |

Please explain how to reproduce approximately these mAP and FPS values.

I need to use version 2 of YOLO. I'm already at training iteration 8780 with yolov2-voc, but I get mAP values greater than 79.20% in the training log output. `!./darknet detector train cfg/voc.data cfg/yolov2-voc.cfg backup/yolov2-voc_last.weights`

(next mAP calculation at 8780 iterations) 8779: 0.503959, 0.592238 avg loss, 0.001000 rate, 1.744993 seconds, 561856 images Loaded: 0.000053 seconds
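For anyone who wants to track these numbers outside the console, here is a minimal Python sketch that parses a progress line like the one above. The field names (`iteration`, `loss`, `avg_loss`, and so on) are labels chosen for this example and are not names used by darknet itself; the pattern is only assumed to fit lines of the shape pasted in this thread.

```python
import re

# Pattern for a darknet training progress line as pasted above.
# The group names are illustrative labels, not darknet identifiers.
LOG_LINE = re.compile(
    r"(?P<iteration>\d+):\s*"
    r"(?P<loss>[\d.]+),\s*"
    r"(?P<avg_loss>[\d.]+) avg loss,\s*"
    r"(?P<rate>[\d.]+) rate,\s*"
    r"(?P<seconds>[\d.]+) seconds,\s*"
    r"(?P<images>\d+) images"
)


def parse_progress(line):
    """Return the numeric fields of one progress line, or None if absent."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "iteration": int(fields["iteration"]),
        "loss": float(fields["loss"]),
        "avg_loss": float(fields["avg_loss"]),
        "rate": float(fields["rate"]),
        "seconds": float(fields["seconds"]),
        "images": int(fields["images"]),
    }


line = ("(next mAP calculation at 8780 iterations) 8779: 0.503959, "
        "0.592238 avg loss, 0.001000 rate, 1.744993 seconds, 561856 images")
print(parse_progress(line))
```

Note that the image counter in this line equals the iteration number times the batch size from the cfg (8779 × 64 = 561856), which is a quick way to confirm that the line was parsed correctly.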

When I try to calculate the mAP with the `!./darknet detector map cfg/voc.data cfg/yolov2-voc.cfg backup/yolov2-voc_last.weights -gpus 0` command, it takes too long. :(

Thank you.

AlexeyAB commented 5 years ago

@devjaynemorais Hi,

Read how to calculate mAP on Pascal VOC for YOLO v2: https://github.com/AlexeyAB/darknet/tree/47c7af1cea5bbdedf1184963355e6418cb8b1b4f#how-to-calculate-map-on-pascalvoc-2007

> I need to use version 2 of YOLO. I'm already at training iteration 8780 with yolov2-voc, but I get mAP values greater than 79.20% in the training log output. `!./darknet detector train cfg/voc.data cfg/yolov2-voc.cfg backup/yolov2-voc_last.weights`
>
> (next mAP calculation at 8780 iterations) 8779: 0.503959, 0.592238 avg loss, 0.001000 rate, 1.744993 seconds, 561856 images Loaded: 0.000053 seconds

I don't see mAP = 79.20% in your output log.

Can you show a screenshot with 79.20%?

> When I try to calculate the mAP with the `!./darknet detector map cfg/voc.data cfg/yolov2-voc.cfg backup/yolov2-voc_last.weights -gpus 0` command, it takes too long. :(

You must wait.

devjaynemorais commented 5 years ago

@AlexeyAB, thanks for the response. My output with that mAP is generated automatically at the end of every iteration.

The 79.20% has since risen to 81.31% because my training is still running and the value keeps growing. See the output below:

rank = 188000 of ranks = 188062
class_id = 0, name = aeroplane, ap = 75.02% (TP = 210, FP = 84)
class_id = 1, name = bicycle, ap = 91.26% (TP = 292, FP = 87)
class_id = 2, name = bird, ap = 81.67% (TP = 373, FP = 126)
class_id = 3, name = boat, ap = 70.49% (TP = 188, FP = 130)
class_id = 4, name = bottle, ap = 47.63% (TP = 249, FP = 289)
class_id = 5, name = bus, ap = 89.71% (TP = 183, FP = 63)
class_id = 6, name = car, ap = 85.52% (TP = 995, FP = 356)
class_id = 7, name = cat, ap = 95.94% (TP = 330, FP = 47)
class_id = 8, name = chair, ap = 70.06% (TP = 530, FP = 486)
class_id = 9, name = cow, ap = 86.88% (TP = 208, FP = 56)
class_id = 10, name = diningtable, ap = 77.65% (TP = 175, FP = 147)
class_id = 11, name = dog, ap = 94.02% (TP = 431, FP = 68)
class_id = 12, name = horse, ap = 93.83% (TP = 297, FP = 48)
class_id = 13, name = motorbike, ap = 88.27% (TP = 265, FP = 120)
class_id = 14, name = person, ap = 77.25% (TP = 3381, FP = 1247)
class_id = 15, name = pottedplant, ap = 52.97% (TP = 292, FP = 403)
class_id = 16, name = sheep, ap = 83.02% (TP = 201, FP = 58)
class_id = 17, name = sofa, ap = 91.59% (TP = 218, FP = 116)
class_id = 18, name = train, ap = 92.46% (TP = 259, FP = 69)
class_id = 19, name = tvmonitor, ap = 81.00% (TP = 259, FP = 158)

for conf_thresh = 0.25, precision = 0.69, recall = 0.78, F1-score = 0.73
for conf_thresh = 0.25, TP = 9336, FP = 4158, FN = 2696, average IoU = 52.15 %

IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.813123, or 81.31 %

Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset

mean_average_precision (mAP@0.5) = 0.813123
New best mAP!
Resizing 320 x 320
try to allocate additional workspace_size = 131.08 MB
CUDA allocate done!
Loaded: 0.000048 seconds
Region Avg IOU: 0.656107, Class: 0.871432, Obj: 0.538800, No Obj: 0.009718, Avg Recall: 0.687500, count: 16
Region Avg IOU: 0.763814, Class: 0.899099, Obj: 0.552278, No Obj: 0.013543, Avg Recall: 0.937500, count: 16
Region Avg IOU: 0.807524, Class: 0.910493, Obj: 0.675922, No Obj: 0.012295, Avg Recall: 1.000000, count: 11
Region Avg IOU: 0.574085, Class: 0.888949, Obj: 0.379296, No Obj: 0.010719, Avg Recall: 0.695652, count: 23

**Should this value be the same as the one calculated manually with the command described in https://github.com/AlexeyAB/darknet/tree/47c7af1cea5bbdedf1184963355e6418cb8b1b4f#how-to-calculate-map-on-pascalvoc-2007 ?**

**Is this value calculated as mAP@0.5, mAP@0.75 or mAP@0.95?**
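As a sanity check on how the final figure is formed, the reported mAP@0.50 in the log above appears to be the plain mean of the 20 per-class `ap` values. The short Python sketch below repeats that arithmetic with the numbers copied from this comment; small differences in the last digit are expected because the log prints each class AP rounded to two decimals.

```python
# Per-class AP values (percent) copied from the log in this comment,
# in class_id order 0..19.
ap_percent = [
    75.02, 91.26, 81.67, 70.49, 47.63, 89.71, 85.52, 95.94, 70.06, 86.88,
    77.65, 94.02, 93.83, 88.27, 77.25, 52.97, 83.02, 91.59, 92.46, 81.00,
]

# mAP is the arithmetic mean over classes.
mean_ap = sum(ap_percent) / len(ap_percent)
print(f"mAP@0.50 = {mean_ap:.2f} %")
```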

AlexeyAB commented 5 years ago

For Pascal VOC, mAP@0.5 is used.
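To make the "@0.5" part concrete: a detection is counted as a true positive only if it has the right class and its IoU with a ground-truth box passes the 0.5 threshold. The Python sketch below illustrates that test. The corner-based box format `(x_min, y_min, x_max, y_max)` and the strict `>` comparison are assumptions of this example, and the helper names `iou` and `is_true_positive` are hypothetical, not darknet functions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, same_class, iou_thresh=0.5):
    """A detection matches when the class agrees and IoU passes the threshold."""
    return same_class and iou(pred_box, truth_box) > iou_thresh


# Good overlap: intersection 80, union 100.
print(iou((0, 0, 10, 10), (0, 0, 10, 8)))
# Half-shifted box: intersection 50, union 150, fails at 0.5.
print(is_true_positive((0, 0, 10, 10), (5, 0, 15, 10), same_class=True))
```

Raising `iou_thresh` to 0.75 or 0.95 gives the stricter mAP@0.75 and mAP@0.95 variants asked about above.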

Can you show the content of your obj.data file?

Can you attach your cfg file?

devjaynemorais commented 5 years ago

Here they are:

!cat cfg/yolov2-voc.cfg

[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=64
subdivisions=8
height=416
width=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 80200
policy=steps
steps=40000,60000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

#######

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[route]
layers=-9

[convolutional]
batch_normalize=1
size=1
stride=1
pad=1
filters=64
activation=leaky

[reorg]
stride=2

[route]
layers=-1,-4

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=125
activation=linear

[region]
anchors =  1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071
bias_match=1
classes=20
coords=4
num=5
softmax=1
jitter=.3
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .6
random=1
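Two values in this cfg can be checked with simple arithmetic: the `filters=125` of the last convolutional layer, which for a `[region]` layer is `num * (classes + coords + 1)`, and the 13 x 13 output grid, which follows from the five stride-2 maxpool layers. The Python sketch below spells this out; `region_filters` is a hypothetical helper name used only for illustration.

```python
def region_filters(num, classes, coords=4):
    """Channels the conv layer before [region] must output:
    per anchor, `coords` box values + 1 objectness score + class scores."""
    return num * (classes + coords + 1)


# Values from the [region] section above: num=5, classes=20, coords=4.
print(region_filters(num=5, classes=20, coords=4))  # 125

# Five maxpool layers with stride 2 downsample the input by 2**5 = 32.
print(416 // 2 ** 5)  # 13
```

The factor of 32 is also why the network width and height are expected to be multiples of 32.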

!cat cfg/voc.data

classes= 20
train  = train.txt
valid  = 2007_test.txt
names = data/voc.names
backup = backup

devjaynemorais commented 5 years ago

I just calculated the mAP with the command `!./darknet detector map cfg/voc.data cfg/yolov2-voc.cfg backup/yolov2-voc_last.weights -gpus 0` and got the results below. However, the README section 'When should I stop training' says to stop when the avg loss no longer decreases. In my case, the avg loss is still high.

(next mAP calculation at 7634 iterations) 
 6919: 0.687181, **0.828664 avg loss**, 0.001000 rate, 5.223025 seconds, 442816 images
Loaded: 0.000041 seconds
0
 compute_capability = 750, cudnn_half = 1 
   layer   filters  size/strd(dil)      input                output
   0 conv     32      3 x 3/ 1    416 x 416 x   3  ->  416 x 416 x  32 0.299 BF
   1 max              2 x 2/ 2    416 x 416 x  32 ->  208 x 208 x  32 0.006 BF
   2 conv     64      3 x 3/ 1    208 x 208 x  32  ->  208 x 208 x  64 1.595 BF
   3 max              2 x 2/ 2    208 x 208 x  64 ->  104 x 104 x  64 0.003 BF
   4 conv    128      3 x 3/ 1    104 x 104 x  64  ->  104 x 104 x 128 1.595 BF
   5 conv     64      1 x 1/ 1    104 x 104 x 128  ->  104 x 104 x  64 0.177 BF
   6 conv    128      3 x 3/ 1    104 x 104 x  64  ->  104 x 104 x 128 1.595 BF
   7 max              2 x 2/ 2    104 x 104 x 128 ->   52 x  52 x 128 0.001 BF
   8 conv    256      3 x 3/ 1     52 x  52 x 128  ->   52 x  52 x 256 1.595 BF
   9 conv    128      1 x 1/ 1     52 x  52 x 256  ->   52 x  52 x 128 0.177 BF
  10 conv    256      3 x 3/ 1     52 x  52 x 128  ->   52 x  52 x 256 1.595 BF
  11 max              2 x 2/ 2     52 x  52 x 256 ->   26 x  26 x 256 0.001 BF
  12 conv    512      3 x 3/ 1     26 x  26 x 256  ->   26 x  26 x 512 1.595 BF
  13 conv    256      1 x 1/ 1     26 x  26 x 512  ->   26 x  26 x 256 0.177 BF
  14 conv    512      3 x 3/ 1     26 x  26 x 256  ->   26 x  26 x 512 1.595 BF
  15 conv    256      1 x 1/ 1     26 x  26 x 512  ->   26 x  26 x 256 0.177 BF
  16 conv    512      3 x 3/ 1     26 x  26 x 256  ->   26 x  26 x 512 1.595 BF
  17 max              2 x 2/ 2     26 x  26 x 512 ->   13 x  13 x 512 0.000 BF
  18 conv   1024      3 x 3/ 1     13 x  13 x 512  ->   13 x  13 x1024 1.595 BF
  19 conv    512      1 x 1/ 1     13 x  13 x1024  ->   13 x  13 x 512 0.177 BF
  20 conv   1024      3 x 3/ 1     13 x  13 x 512  ->   13 x  13 x1024 1.595 BF
  21 conv    512      1 x 1/ 1     13 x  13 x1024  ->   13 x  13 x 512 0.177 BF
  22 conv   1024      3 x 3/ 1     13 x  13 x 512  ->   13 x  13 x1024 1.595 BF
  23 conv   1024      3 x 3/ 1     13 x  13 x1024  ->   13 x  13 x1024 3.190 BF
  24 conv   1024      3 x 3/ 1     13 x  13 x1024  ->   13 x  13 x1024 3.190 BF
  25 route  16
  26 conv     64      1 x 1/ 1     26 x  26 x 512  ->   26 x  26 x  64 0.044 BF
  27 
 reorg_old 
reorg_old              / 2    26 x  26 x  64   ->    13 x  13 x 256
  28 route  27 24
  29 conv   1024      3 x 3/ 1     13 x  13 x1280  ->   13 x  13 x1024 3.987 BF
  30 conv    125      1 x 1/ 1     13 x  13 x1024  ->   13 x  13 x 125 0.043 BF
  31 detection
mask_scale: Using default '1.000000'
Total BFLOPS 29.371 
 Allocate additional workspace_size = 131.08 MB 
Loading weights from backup/yolov2-voc_last.weights...
 seen 64 
Done!

 calculation mAP (mean average precision)...
4952
 detections_count = 216098, unique_truth_count = 12032  
class_id = 0, name = aeroplane, ap = 67.88%      (TP = 197, FP = 153) 
class_id = 1, name = bicycle, ap = 86.24%        (TP = 278, FP = 190) 
class_id = 2, name = bird, ap = 78.26%       (TP = 360, FP = 221) 
class_id = 3, name = boat, ap = 69.78%       (TP = 193, FP = 153) 
class_id = 4, name = bottle, ap = 42.33%     (TP = 220, FP = 254) 
class_id = 5, name = bus, ap = 89.22%        (TP = 184, FP = 64) 
class_id = 6, name = car, ap = 84.62%        (TP = 993, FP = 507) 
class_id = 7, name = cat, ap = 95.60%        (TP = 320, FP = 52) 
class_id = 8, name = chair, ap = 66.81%      (TP = 533, FP = 1031) 
class_id = 9, name = cow, ap = 80.01%        (TP = 205, FP = 190) 
class_id = 10, name = diningtable, ap = 80.01%       (TP = 170, FP = 155) 
class_id = 11, name = dog, ap = 94.38%       (TP = 425, FP = 119) 
class_id = 12, name = horse, ap = 92.53%     (TP = 309, FP = 118) 
class_id = 13, name = motorbike, ap = 89.39%     (TP = 270, FP = 98) 
class_id = 14, name = person, ap = 76.86%        (TP = 3502, FP = 2295) 
class_id = 15, name = pottedplant, ap = 53.31%       (TP = 285, FP = 355) 
class_id = 16, name = sheep, ap = 82.53%     (TP = 199, FP = 83) 
class_id = 17, name = sofa, ap = 90.61%      (TP = 214, FP = 100) 
class_id = 18, name = train, ap = 92.92%     (TP = 258, FP = 53) 
class_id = 19, name = tvmonitor, ap = 79.70%     (TP = 256, FP = 230) 

 for conf_thresh = 0.25, precision = 0.59, recall = 0.78, F1-score = 0.67 
 for conf_thresh = 0.25, TP = 9371, FP = 6421, FN = 2661, average IoU = 44.33 % 

 IoU threshold = 50 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.50) = 0.796496, or **79.65 %** 
Total Detection Time: 220.000000 Seconds

Set -points flag:
 `-points 101` for MS COCO 
 `-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data) 
 `-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
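
The precision, recall and F1-score printed for `conf_thresh = 0.25` can be reproduced from the TP, FP and FN counts on the following log line. The Python sketch below uses the standard definitions; `detection_metrics` is a hypothetical helper name, and the counts are copied from the log above.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the log: TP = 9371, FP = 6421, FN = 2661.
p, r, f1 = detection_metrics(tp=9371, fp=6421, fn=2661)
print(f"precision = {p:.2f}, recall = {r:.2f}, F1-score = {f1:.2f}")
```

As a consistency check, TP + FN = 9371 + 2661 = 12032, which matches `unique_truth_count` in the same log.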

devjaynemorais commented 5 years ago

@AlexeyAB are my configuration files correct?

AlexeyAB commented 5 years ago

> are my configuration files correct?

Yes.

> classes= 20
> train = train.txt
> valid = 2007_test.txt
> names = data/voc.names
> backup = backup

as described here: https://github.com/AlexeyAB/darknet/tree/47c7af1cea5bbdedf1184963355e6418cb8b1b4f#how-to-calculate-map-on-pascalvoc-2007

devjaynemorais commented 5 years ago

Hello @AlexeyAB

I finished the training following your guidelines. I obtained a 76.25% mAP. :D

0

 reorg_old 

 seen 64 

 calculation mAP (mean average precision)...

 detections_count = 117114, unique_truth_count = 12032  
class_id = 0, name = aeroplane, ap = 77.43%      (TP = 218, FP = 117) 
class_id = 1, name = bicycle, ap = 85.24%        (TP = 270, FP = 84) 
class_id = 2, name = bird, ap = 74.29%       (TP = 338, FP = 154) 
class_id = 3, name = boat, ap = 65.76%       (TP = 182, FP = 167) 
class_id = 4, name = bottle, ap = 45.05%     (TP = 226, FP = 265) 
class_id = 5, name = bus, ap = 81.97%        (TP = 165, FP = 60) 
class_id = 6, name = car, ap = 83.98%        (TP = 1005, FP = 526) 
class_id = 7, name = cat, ap = 90.74%        (TP = 309, FP = 72) 
class_id = 8, name = chair, ap = 57.82%      (TP = 473, FP = 788) 
class_id = 9, name = cow, ap = 80.08%        (TP = 200, FP = 138) 
class_id = 10, name = diningtable, ap = 76.47%       (TP = 164, FP = 135) 
class_id = 11, name = dog, ap = 87.72%       (TP = 408, FP = 146) 
class_id = 12, name = horse, ap = 86.73%     (TP = 290, FP = 80) 
class_id = 13, name = motorbike, ap = 84.36%     (TP = 263, FP = 85) 
class_id = 14, name = person, ap = 76.94%        (TP = 3447, FP = 1706) 
class_id = 15, name = pottedplant, ap = 47.27%       (TP = 256, FP = 370) 
class_id = 16, name = sheep, ap = 78.37%     (TP = 196, FP = 120) 
class_id = 17, name = sofa, ap = 79.00%      (TP = 189, FP = 257) 
class_id = 18, name = train, ap = 88.34%     (TP = 253, FP = 92) 
class_id = 19, name = tvmonitor, ap = 77.35%     (TP = 242, FP = 157) 

 for conf_thresh = 0.25, precision = 0.62, recall = 0.76, F1-score = 0.68 
 for conf_thresh = 0.25, TP = 9094, FP = 5519, FN = 2938, average IoU = 48.06 % 

 IoU threshold = 50 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.50) = 0.762457, or **76.25 %** 

Set -points flag:
 `-points 101` for MS COCO 
 `-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data) 
 `-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
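
The `-points` note above refers to two ways of turning a precision-recall curve into a single AP value. The Python sketch below shows the textbook Pascal VOC definitions (11-point interpolation, and area under the interpolated curve at each unique recall) on a toy list of detections. It is an illustration under those assumptions, not a copy of darknet's code, and the function names are hypothetical.

```python
def pr_curve(is_tp, num_truth):
    """Precision and recall after each detection.

    is_tp: flags for detections sorted by descending confidence.
    num_truth: number of ground-truth objects of this class.
    """
    precisions, recalls = [], []
    tp = fp = 0
    for flag in is_tp:
        if flag:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_truth)
    return precisions, recalls


def ap_11_point(precisions, recalls):
    """Mean of the best precision at recall >= 0.0, 0.1, ..., 1.0."""
    total = 0.0
    for i in range(11):
        level = i / 10
        reachable = [p for p, r in zip(precisions, recalls) if r >= level]
        total += max(reachable, default=0.0)
    return total / 11


def ap_area_under_curve(precisions, recalls):
    """Area under the interpolated curve, one step per unique recall value."""
    ap = 0.0
    prev_recall = 0.0
    for i, recall in enumerate(recalls):
        if recall > prev_recall:
            # Interpolated precision: best precision at this recall or beyond.
            ap += (recall - prev_recall) * max(precisions[i:])
            prev_recall = recall
    return ap


# Toy example: 5 detections, 4 ground-truth objects.
precisions, recalls = pr_curve([True, True, False, True, False], num_truth=4)
print(ap_11_point(precisions, recalls))
print(ap_area_under_curve(precisions, recalls))
```

The two methods generally give slightly different numbers for the same detections, so a value computed one way should only be compared with published figures computed the same way.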

Thank you for the directions, they were very helpful.

Lastly:

1) would you like to know how to calculate mAP by image size?

2) And the FPS, where do I see the result by the size of the image?

As shown in Table 3 on page 4 of the paper [YOLO9000: Better, Faster, Stronger](https://arxiv.org/pdf/1612.08242.pdf).

| Detection Frameworks | Train | mAP | FPS |
|---|---|---|---|
| YOLOv2 288 × 288 | 2007 + 2012 | 69.0 | 91 |
| YOLOv2 352 × 352 | 2007 + 2012 | 73.7 | 81 |
| YOLOv2 416 × 416 | 2007 + 2012 | 76.8 | 67 |
| YOLOv2 480 × 480 | 2007 + 2012 | 77.8 | 59 |
| YOLOv2 544 × 544 | 2007 + 2012 | 78.6 | 40 |

AlexeyAB commented 5 years ago

@devjaynemorais

> 1) would you like to know how to calculate mAP by image size?

What do you mean?

> 2) And the FPS, where do I see the result by the size of the image?

devjaynemorais commented 5 years ago

@AlexeyAB Sorry for the typo.

In this table, there is a different mAP for each resolution: 288 × 288, 352 × 352, 416 × 416, 480 × 480 and 544 × 544. I would like to know how to calculate the mAP for each image resolution (in the same way it was done for the values in Table 3 above). How do I find these mAP values?

AlexeyAB commented 5 years ago

Just change `width` and `height` in the cfg file.
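A minimal Python sketch of that workflow, assuming one cfg copy per resolution: it rewrites the first `width=` and `height=` lines of the cfg text and prints the `detector map` command for each size in the table. The helper `set_network_size` and the per-size cfg file names (`cfg/yolov2-voc-288.cfg` and so on) are hypothetical; the command itself is the one used earlier in this thread.

```python
import re


def set_network_size(cfg_text, size):
    """Return cfg_text with the first width= and height= lines set to `size`."""
    if size % 32 != 0:
        # The network downsamples by 32, so sizes should be multiples of 32.
        raise ValueError("size must be a multiple of 32")
    out = re.sub(r"(?m)^width[ \t]*=[ \t]*\d+[ \t]*$", f"width={size}",
                 cfg_text, count=1)
    out = re.sub(r"(?m)^height[ \t]*=[ \t]*\d+[ \t]*$", f"height={size}",
                 out, count=1)
    return out


# Shortened stand-in for the [net] section of yolov2-voc.cfg.
sample = "[net]\nbatch=64\nheight=416\nwidth=416\nchannels=3\n"

for size in (288, 352, 416, 480, 544):
    cfg = set_network_size(sample, size)
    # In practice, write `cfg` to its own file (hypothetical name below)
    # and run the printed command on it.
    print(f"./darknet detector map cfg/voc.data cfg/yolov2-voc-{size}.cfg "
          f"backup/yolov2-voc_last.weights")
```

Each run then reports the mAP for that input resolution, using the same weights file.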

devjaynemorais commented 5 years ago

Thank you so much. =D