Open devjaynemorais opened 5 years ago
@devjaynemorais Hi,
Read how to get mAP on Pascal VOC and Yolo v2: https://github.com/AlexeyAB/darknet/tree/47c7af1cea5bbdedf1184963355e6418cb8b1b4f#how-to-calculate-map-on-pascalvoc-2007
I need to use version 2 of YOLO. I'm already at training iteration 8780 with yolov2-voc, but I get mAP values greater than 79.20% in the training log output.
!./darknet detector train cfg/voc.data cfg/yolov2-voc.cfg backup/yolov2-voc_last.weights
(next mAP calculation at 8780 iterations)
8779: 0.503959, 0.592238 avg loss, 0.001000 rate, 1.744993 seconds, 561856 images
Loaded: 0.000053 seconds
I don't see mAP=79.20% in your output log.
Can you show a screenshot with 79.20%?
When I try to calculate the mAP with the `!./darknet detector map cfg/voc.data cfg/yolov2-voc.cfg backup/yolov2-voc_last.weights -gpus 0` command, it takes too long. :(
You must wait.
@AlexeyAB, thanks for the response. My output is generated automatically, with that mAP, at the end of every iteration.
The 79.20% has already risen to 81.31% because my training is still running and this value keeps growing; see the output below:
rank = 188000 of ranks = 188062
class_id = 0, name = aeroplane, ap = 75.02% (TP = 210, FP = 84)
class_id = 1, name = bicycle, ap = 91.26% (TP = 292, FP = 87)
class_id = 2, name = bird, ap = 81.67% (TP = 373, FP = 126)
class_id = 3, name = boat, ap = 70.49% (TP = 188, FP = 130)
class_id = 4, name = bottle, ap = 47.63% (TP = 249, FP = 289)
class_id = 5, name = bus, ap = 89.71% (TP = 183, FP = 63)
class_id = 6, name = car, ap = 85.52% (TP = 995, FP = 356)
class_id = 7, name = cat, ap = 95.94% (TP = 330, FP = 47)
class_id = 8, name = chair, ap = 70.06% (TP = 530, FP = 486)
class_id = 9, name = cow, ap = 86.88% (TP = 208, FP = 56)
class_id = 10, name = diningtable, ap = 77.65% (TP = 175, FP = 147)
class_id = 11, name = dog, ap = 94.02% (TP = 431, FP = 68)
class_id = 12, name = horse, ap = 93.83% (TP = 297, FP = 48)
class_id = 13, name = motorbike, ap = 88.27% (TP = 265, FP = 120)
class_id = 14, name = person, ap = 77.25% (TP = 3381, FP = 1247)
class_id = 15, name = pottedplant, ap = 52.97% (TP = 292, FP = 403)
class_id = 16, name = sheep, ap = 83.02% (TP = 201, FP = 58)
class_id = 17, name = sofa, ap = 91.59% (TP = 218, FP = 116)
class_id = 18, name = train, ap = 92.46% (TP = 259, FP = 69)
class_id = 19, name = tvmonitor, ap = 81.00% (TP = 259, FP = 158)
for conf_thresh = 0.25, precision = 0.69, recall = 0.78, F1-score = 0.73
for conf_thresh = 0.25, TP = 9336, FP = 4158, FN = 2696, average IoU = 52.15 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.813123, or 81.31 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
mean_average_precision (mAP@0.5) = 0.813123
New best mAP!
Resizing 320 x 320
try to allocate additional workspace_size = 131.08 MB
CUDA allocate done!
Loaded: 0.000048 seconds
Region Avg IOU: 0.656107, Class: 0.871432, Obj: 0.538800, No Obj: 0.009718, Avg Recall: 0.687500, count: 16
Region Avg IOU: 0.763814, Class: 0.899099, Obj: 0.552278, No Obj: 0.013543, Avg Recall: 0.937500, count: 16
Region Avg IOU: 0.807524, Class: 0.910493, Obj: 0.675922, No Obj: 0.012295, Avg Recall: 1.000000, count: 11
Region Avg IOU: 0.574085, Class: 0.888949, Obj: 0.379296, No Obj: 0.010719, Avg Recall: 0.695652, count: 23
**Should this value be the same as the one calculated manually with the command described in https://github.com/AlexeyAB/darknet/tree/47c7af1cea5bbdedf1184963355e6418cb8b1b4f#how-to-calculate-map-on-pascalvoc-2007 ?
Is this value calculated as mAP@0.5, mAP@0.75 or mAP@0.95?**
For PascalVOC, mAP@0.5 is used.
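As a sanity check on what that number means, the reported mAP@0.50 can be reproduced as the plain arithmetic mean of the 20 per-class `ap` values printed in the training log above. The sketch below only copies those values from the log; the function name `mean_average_precision` is made up for illustration and is not part of darknet.

```python
# Per-class AP values (in percent) copied from the training log above,
# in class_id order 0..19 (aeroplane ... tvmonitor).
per_class_ap = [
    75.02, 91.26, 81.67, 70.49, 47.63, 89.71, 85.52, 95.94, 70.06, 86.88,
    77.65, 94.02, 93.83, 88.27, 77.25, 52.97, 83.02, 91.59, 92.46, 81.00,
]


def mean_average_precision(ap_values):
    """Return the arithmetic mean of the per-class AP values."""
    return sum(ap_values) / len(ap_values)


map_percent = mean_average_precision(per_class_ap)
print(f"mAP@0.50 = {map_percent:.2f} %")  # prints: mAP@0.50 = 81.31 %
```

The result agrees with the `81.31 %` line in the log, so the logged value is the mean over classes at an IoU threshold of 50 %.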
Can you show the content of your obj.data file?
Can you attach your cfg-file?
Here they are:
!cat cfg/yolov2-voc.cfg
[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=64
subdivisions=8
height=416
width=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.001
burn_in=1000
max_batches = 80200
policy=steps
steps=40000,60000
scales=.1,.1
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
#######
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[route]
layers=-9
[convolutional]
batch_normalize=1
size=1
stride=1
pad=1
filters=64
activation=leaky
[reorg]
stride=2
[route]
layers=-1,-4
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=125
activation=linear
[region]
anchors = 1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071
bias_match=1
classes=20
coords=4
num=5
softmax=1
jitter=.3
rescore=1
object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1
absolute=1
thresh = .6
random=1
!cat cfg/voc.data
classes= 20
train = train.txt
valid = 2007_test.txt
names = data/voc.names
backup = backup
I just calculated the mAP with the command `!./darknet detector map cfg/voc.data cfg/yolov2-voc.cfg backup/yolov2-voc_last.weights -gpus 0`
and got the results again. However, the 'When should I stop training' part of the readme says to stop when the avg loss stops decreasing. In my case, it is still high.
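The "avg loss" in question is the second number on each training line (for example `0.828664 avg loss` in the log that follows). A rough way to watch it is to pull those values out of a saved log and compare recent iterations with earlier ones. The sketch below assumes the training output was saved to text; the plateau rule (`has_plateaued`, its `window` and `tolerance`) is an arbitrary illustration, not a criterion taken from the readme.

```python
import re

# Matches darknet training lines such as
# "6919: 0.687181, 0.828664 avg loss, 0.001000 rate, ..."
LOSS_RE = re.compile(r"(\d+):\s*([\d.]+),\s*([\d.]+)\s+avg loss")


def parse_avg_loss(log_text):
    """Return a list of (iteration, avg_loss) pairs found in the log text."""
    pairs = []
    for line in log_text.splitlines():
        match = LOSS_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(3))))
    return pairs


def has_plateaued(losses, window=3, tolerance=0.01):
    """Heuristic (an assumption): the mean of the last `window` losses
    improved by less than `tolerance` over the mean of the window before it."""
    if len(losses) < 2 * window:
        return False
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    return previous - recent < tolerance


# Two training lines quoted in this thread.
sample_log = """\
6919: 0.687181, 0.828664 avg loss, 0.001000 rate, 5.223025 seconds, 442816 images
8779: 0.503959, 0.592238 avg loss, 0.001000 rate, 1.744993 seconds, 561856 images
"""
pairs = parse_avg_loss(sample_log)
losses = [loss for _, loss in pairs]
print(pairs)
print(has_plateaued(losses, window=1))
```

Between the two lines quoted here the avg loss is still falling (0.83 to 0.59), which this heuristic reports as "not plateaued".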
(next mAP calculation at 7634 iterations)
6919: 0.687181, **0.828664 avg loss**, 0.001000 rate, 5.223025 seconds, 442816 images
Loaded: 0.000041 seconds
0
compute_capability = 750, cudnn_half = 1
layer filters size/strd(dil) input output
0 conv 32 3 x 3/ 1 416 x 416 x 3 -> 416 x 416 x 32 0.299 BF
1 max 2 x 2/ 2 416 x 416 x 32 -> 208 x 208 x 32 0.006 BF
2 conv 64 3 x 3/ 1 208 x 208 x 32 -> 208 x 208 x 64 1.595 BF
3 max 2 x 2/ 2 208 x 208 x 64 -> 104 x 104 x 64 0.003 BF
4 conv 128 3 x 3/ 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BF
5 conv 64 1 x 1/ 1 104 x 104 x 128 -> 104 x 104 x 64 0.177 BF
6 conv 128 3 x 3/ 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BF
7 max 2 x 2/ 2 104 x 104 x 128 -> 52 x 52 x 128 0.001 BF
8 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
9 conv 128 1 x 1/ 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
10 conv 256 3 x 3/ 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
11 max 2 x 2/ 2 52 x 52 x 256 -> 26 x 26 x 256 0.001 BF
12 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
13 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
14 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
15 conv 256 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
16 conv 512 3 x 3/ 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
17 max 2 x 2/ 2 26 x 26 x 512 -> 13 x 13 x 512 0.000 BF
18 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
19 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
20 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
21 conv 512 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
22 conv 1024 3 x 3/ 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
23 conv 1024 3 x 3/ 1 13 x 13 x1024 -> 13 x 13 x1024 3.190 BF
24 conv 1024 3 x 3/ 1 13 x 13 x1024 -> 13 x 13 x1024 3.190 BF
25 route 16
26 conv 64 1 x 1/ 1 26 x 26 x 512 -> 26 x 26 x 64 0.044 BF
27 reorg_old / 2 26 x 26 x 64 -> 13 x 13 x 256
28 route 27 24
29 conv 1024 3 x 3/ 1 13 x 13 x1280 -> 13 x 13 x1024 3.987 BF
30 conv 125 1 x 1/ 1 13 x 13 x1024 -> 13 x 13 x 125 0.043 BF
31 detection
mask_scale: Using default '1.000000'
Total BFLOPS 29.371
Allocate additional workspace_size = 131.08 MB
Loading weights from backup/yolov2-voc_last.weights...
seen 64
Done!
calculation mAP (mean average precision)...
4952
detections_count = 216098, unique_truth_count = 12032
class_id = 0, name = aeroplane, ap = 67.88% (TP = 197, FP = 153)
class_id = 1, name = bicycle, ap = 86.24% (TP = 278, FP = 190)
class_id = 2, name = bird, ap = 78.26% (TP = 360, FP = 221)
class_id = 3, name = boat, ap = 69.78% (TP = 193, FP = 153)
class_id = 4, name = bottle, ap = 42.33% (TP = 220, FP = 254)
class_id = 5, name = bus, ap = 89.22% (TP = 184, FP = 64)
class_id = 6, name = car, ap = 84.62% (TP = 993, FP = 507)
class_id = 7, name = cat, ap = 95.60% (TP = 320, FP = 52)
class_id = 8, name = chair, ap = 66.81% (TP = 533, FP = 1031)
class_id = 9, name = cow, ap = 80.01% (TP = 205, FP = 190)
class_id = 10, name = diningtable, ap = 80.01% (TP = 170, FP = 155)
class_id = 11, name = dog, ap = 94.38% (TP = 425, FP = 119)
class_id = 12, name = horse, ap = 92.53% (TP = 309, FP = 118)
class_id = 13, name = motorbike, ap = 89.39% (TP = 270, FP = 98)
class_id = 14, name = person, ap = 76.86% (TP = 3502, FP = 2295)
class_id = 15, name = pottedplant, ap = 53.31% (TP = 285, FP = 355)
class_id = 16, name = sheep, ap = 82.53% (TP = 199, FP = 83)
class_id = 17, name = sofa, ap = 90.61% (TP = 214, FP = 100)
class_id = 18, name = train, ap = 92.92% (TP = 258, FP = 53)
class_id = 19, name = tvmonitor, ap = 79.70% (TP = 256, FP = 230)
for conf_thresh = 0.25, precision = 0.59, recall = 0.78, F1-score = 0.67
for conf_thresh = 0.25, TP = 9371, FP = 6421, FN = 2661, average IoU = 44.33 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.796496, or **79.65 %**
Total Detection Time: 220.000000 Seconds
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
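The `-points` hint at the end of the log distinguishes between sampling the precision/recall curve at a fixed number of recall levels (11 for PascalVOC 2007, 101 for MS COCO) and integrating the area under the curve. The sketch below illustrates the two conventions on a toy curve; it is a generic textbook-style implementation with made-up inputs, not darknet's actual code, and the function name `average_precision` is hypothetical.

```python
def average_precision(recalls, precisions, points=0):
    """Compute AP from a precision/recall curve.

    `recalls` must be sorted in ascending order, with `precisions[i]`
    the precision measured at `recalls[i]`.
    points=0  -> area under the interpolated curve (AUC).
    points=N  -> mean interpolated precision at N evenly spaced recall
                 levels from 0 to 1 (N=11 for the VOC 2007 convention).
    """
    if points > 0:
        total = 0.0
        for i in range(points):
            level = i / (points - 1)
            # Interpolated precision: best precision at any recall >= level.
            candidates = [p for r, p in zip(recalls, precisions) if r >= level]
            total += max(candidates) if candidates else 0.0
        return total / points

    area = 0.0
    previous_recall = 0.0
    for i, recall in enumerate(recalls):
        interpolated = max(precisions[i:])
        area += (recall - previous_recall) * interpolated
        previous_recall = recall
    return area


# Toy curve, invented purely for illustration.
toy_recalls = [0.2, 0.4, 0.6, 0.8]
toy_precisions = [1.0, 0.8, 0.6, 0.5]
ap_auc = average_precision(toy_recalls, toy_precisions, points=0)
ap_11 = average_precision(toy_recalls, toy_precisions, points=11)
print(f"AUC AP = {ap_auc:.4f}, 11-point AP = {ap_11:.4f}")
```

On the same curve the two conventions give different numbers, which is one reason mAP values should only be compared when they were computed with the same `-points` setting.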
@AlexeyAB are my configuration files correct?
are my configuration files correct?
Yes.
classes= 20 train = train.txt valid = 2007_test.txt names = data/voc.names backup = backup
How did you get train.txt? It should not include images from 2007_test.txt
Also add this line, uncommented, to your obj.data file: https://github.com/AlexeyAB/darknet/blob/5ec35922d5215e11466a9bb1602f81d1746ccbe5/build/darknet/x64/data/voc.data#L4
Try to check the mAP by using these commands: https://github.com/AlexeyAB/darknet/blob/47c7af1cea5bbdedf1184963355e6418cb8b1b4f/build/darknet/x64/calc_mAP_voc_py.cmd#L8-L11
as described here: https://github.com/AlexeyAB/darknet/tree/47c7af1cea5bbdedf1184963355e6418cb8b1b4f#how-to-calculate-map-on-pascalvoc-2007
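The point about `train.txt` not containing images from `2007_test.txt` can be checked mechanically. The sketch below assumes both files list one image path per line and compares images by file name without extension; the example paths are hypothetical, and if two different images could share a file name in your setup you would want to compare full paths instead.

```python
import os


def image_ids(list_text):
    """Return the set of image file stems from a one-path-per-line list."""
    ids = set()
    for line in list_text.splitlines():
        line = line.strip()
        if line:
            ids.add(os.path.splitext(os.path.basename(line))[0])
    return ids


def overlapping_images(train_text, valid_text):
    """Return the sorted image stems that appear in both lists."""
    return sorted(image_ids(train_text) & image_ids(valid_text))


# Hypothetical list contents; in practice read them from disk, e.g.
#   train_text = open("train.txt").read()
#   valid_text = open("2007_test.txt").read()
train_text = "VOC2007/JPEGImages/000005.jpg\nVOC2007/JPEGImages/000007.jpg\n"
valid_text = "VOC2007/JPEGImages/000001.jpg\nVOC2007/JPEGImages/000007.jpg\n"
leaked = overlapping_images(train_text, valid_text)
print(leaked)  # prints: ['000007']
```

An empty list means no validation image was also used for training; any entries would inflate the reported mAP.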
Hello @AlexeyAB
I finished the training following your guidelines. I obtained a 76.25% mAP. :D
0
reorg_old
seen 64
calculation mAP (mean average precision)...
detections_count = 117114, unique_truth_count = 12032
class_id = 0, name = aeroplane, ap = 77.43% (TP = 218, FP = 117)
class_id = 1, name = bicycle, ap = 85.24% (TP = 270, FP = 84)
class_id = 2, name = bird, ap = 74.29% (TP = 338, FP = 154)
class_id = 3, name = boat, ap = 65.76% (TP = 182, FP = 167)
class_id = 4, name = bottle, ap = 45.05% (TP = 226, FP = 265)
class_id = 5, name = bus, ap = 81.97% (TP = 165, FP = 60)
class_id = 6, name = car, ap = 83.98% (TP = 1005, FP = 526)
class_id = 7, name = cat, ap = 90.74% (TP = 309, FP = 72)
class_id = 8, name = chair, ap = 57.82% (TP = 473, FP = 788)
class_id = 9, name = cow, ap = 80.08% (TP = 200, FP = 138)
class_id = 10, name = diningtable, ap = 76.47% (TP = 164, FP = 135)
class_id = 11, name = dog, ap = 87.72% (TP = 408, FP = 146)
class_id = 12, name = horse, ap = 86.73% (TP = 290, FP = 80)
class_id = 13, name = motorbike, ap = 84.36% (TP = 263, FP = 85)
class_id = 14, name = person, ap = 76.94% (TP = 3447, FP = 1706)
class_id = 15, name = pottedplant, ap = 47.27% (TP = 256, FP = 370)
class_id = 16, name = sheep, ap = 78.37% (TP = 196, FP = 120)
class_id = 17, name = sofa, ap = 79.00% (TP = 189, FP = 257)
class_id = 18, name = train, ap = 88.34% (TP = 253, FP = 92)
class_id = 19, name = tvmonitor, ap = 77.35% (TP = 242, FP = 157)
for conf_thresh = 0.25, precision = 0.62, recall = 0.76, F1-score = 0.68
for conf_thresh = 0.25, TP = 9094, FP = 5519, FN = 2938, average IoU = 48.06 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.762457, or **76.25 %**
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
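The precision, recall and F1-score printed for `conf_thresh = 0.25` follow directly from the TP, FP and FN counts on the next log line. The sketch below recomputes them from the counts of this final run using the standard definitions; the function name `detection_metrics` is made up for illustration.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts copied from the log above (conf_thresh = 0.25).
tp, fp, fn = 9094, 5519, 2938
precision, recall, f1 = detection_metrics(tp, fp, fn)
print(f"precision = {precision:.2f}, recall = {recall:.2f}, F1-score = {f1:.2f}")
# prints: precision = 0.62, recall = 0.76, F1-score = 0.68

# TP + FN equals the number of ground-truth objects (unique_truth_count).
print(tp + fn)  # prints: 12032
```

Note that these three figures depend on the confidence threshold, whereas the per-class AP and the mAP are computed over the whole ranked list of detections.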
Thank you for the directions, they were very helpful.
1) would you like to know how to calculate mAP by image size?
2) And the FPS, where do I see the result by the size of the image?
As shown in Table 3 on page 4 of the paper [YOLO9000: Better, Faster, Stronger](https://arxiv.org/pdf/1612.08242.pdf).
Detection Frameworks | Train | mAP | FPS |
---|---|---|---|
YOLOv2 288 × 288 | 2007 + 2012 | 69.0 | 91 |
YOLOv2 352 × 352 | 2007 + 2012 | 73.7 | 81 |
YOLOv2 416 × 416 | 2007 + 2012 | 76.8 | 67 |
YOLOv2 480 × 480 | 2007 + 2012 | 77.8 | 59 |
YOLOv2 544 × 544 | 2007 + 2012 | 78.6 | 40 |
@devjaynemorais
1) would you like to know how to calculate mAP by image size?
What do you mean?
2) And the FPS, where do I see the result by the size of the image?
Run `./darknet detector demo ... test.mp4` on some video file, and you will see the FPS in the console.
@AlexeyAB Sorry for the typo.
In this table, there is a different mAP for each resolution: 288 × 288, 352 × 352, 416 × 416, 480 × 480 and 544 × 544. I would like to know how to calculate the mAP for each image resolution, in the same way the values in Table 3 above were obtained. How do I find these mAP values?
Just change width and height in cfg-file
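One way to follow that advice for all five resolutions in the table is to generate a copy of the cfg per size and run the `detector map` command from this thread against each copy. The sketch below only rewrites the `width=` and `height=` lines of a cfg given as text; the helper name `set_network_size` and the output file naming are assumptions. The multiple-of-32 check reflects the layer list earlier in the thread, where a 416 x 416 input ends up as a 13 x 13 grid.

```python
import re


def set_network_size(cfg_text, size):
    """Return cfg_text with the first width= and height= lines set to size."""
    if size % 32 != 0:
        raise ValueError("size should be a multiple of 32 for this network")
    cfg_text = re.sub(r"(?m)^width\s*=\s*\d+", f"width={size}", cfg_text, count=1)
    cfg_text = re.sub(r"(?m)^height\s*=\s*\d+", f"height={size}", cfg_text, count=1)
    return cfg_text


# Shortened stand-in for the [net] section of cfg/yolov2-voc.cfg.
sample_cfg = "[net]\nbatch=64\nsubdivisions=8\nheight=416\nwidth=416\nchannels=3\n"

variants = {size: set_network_size(sample_cfg, size)
            for size in (288, 352, 416, 480, 544)}
print(variants[544])
```

In practice each variant would be written to its own file (for example a hypothetical `cfg/yolov2-voc-544.cfg`) and passed to `./darknet detector map` in place of the original cfg, keeping the same weights file.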
Thank you so much. =D
Hello, I would like to know which mAP (@0.5, @0.75, @0.95, ...) was used for Table 3 on page 4 of the paper YOLO9000: Better, Faster, Stronger.
Please explain how to get the approximate value of this mAP and FPS.
I need to use version 2 of YOLO. I'm already at training iteration 8780 with yolov2-voc, but I get mAP values greater than 79.20% in the training log output.
!./darknet detector train cfg/voc.data cfg/yolov2-voc.cfg backup/yolov2-voc_last.weights
When I try to calculate the mAP with the
!./darknet detector map cfg/voc.data cfg/yolov2-voc.cfg backup/yolov2-voc_last.weights -gpus 0
command, it takes too long. :(
Thank you.