ndcuong91 opened this issue 6 years ago (status: Open)
Yes, that's a great idea! Any recommendations/idea of how this should be implemented?
Idea 0: We could add an optional `difficult` flag at the end of each line of the ground-truth files. The ground-truth files would then have the following format (objects marked as difficult would be ignored during evaluation):
```
<class_name> <left> <top> <right> <bottom> [<difficult>]
```
Example of a ground-truth file using Idea 0:
```
tvmonitor 2 10 173 238 difficult
book 439 157 556 241
book 437 246 518 351
pottedplant 272 190 316 259
```
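A minimal sketch (not the tool's actual code) of how a line in this format, with the optional trailing flag, could be parsed; the function name is invented for the example:

```python
# Hypothetical parser for the proposed ground-truth format:
# <class_name> <left> <top> <right> <bottom> [<difficult>]
def parse_gt_line(line):
    parts = line.split()
    is_difficult = parts[-1] == "difficult"
    if is_difficult:
        parts = parts[:-1]
    class_name = parts[0]
    left, top, right, bottom = map(int, parts[1:5])
    return class_name, (left, top, right, bottom), is_difficult

# Difficult objects can then simply be skipped during evaluation.
print(parse_gt_line("tvmonitor 2 10 173 238 difficult"))
# -> ('tvmonitor', (2, 10, 173, 238), True)
```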
What do you think?
@Cartucho that's an acceptable solution. Do you have any plan to implement it? By the way, I think the processing time of this tool is not good enough. When I tested it on VOC 2007, it took about 15 minutes to get the final result (with image saving and animation already turned off). With AlexeyAB's tool (https://github.com/AlexeyAB/darknet#how-to-calculate-map-on-pascalvoc-2007) I needed only about 30 seconds to get the final mAP, so I have switched to that tool for now.
@titikid Yes, I will implement it then.
Thanks for noticing that, could you please tell me the output of:
```
python -m cProfile main.py -na -np
```
This way I will know what to improve.
@titikid I just added the difficult feature. Let me know if it works for you!
@Cartucho Great! I will try it and give you feedback.
@Cartucho this is the output of `python -m cProfile main.py -na -np`:
22.73% = backpack AP
85.94% = bed AP
17.52% = book AP
14.29% = bookcase AP
23.48% = bottle AP
31.86% = bowl AP
7.93% = cabinetry AP
53.84% = chair AP
4.55% = coffeetable AP
19.05% = countertop AP
42.50% = cup AP
39.66% = diningtable AP
0.00% = doll AP
20.69% = door AP
7.69% = heater AP
71.43% = nightstand AP
42.86% = person AP
17.71% = pictureframe AP
13.01% = pillow AP
62.31% = pottedplant AP
73.21% = remote AP
0.00% = shelf AP
16.33% = sink AP
90.48% = sofa AP
1.39% = tap AP
0.00% = tincan AP
63.25% = tvmonitor AP
18.75% = vase AP
45.45% = wastecontainer AP
23.53% = windowblind AP
mAP = 31.05%

```
174450 function calls (174350 primitive calls) in 0.118 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     3    0.000    0.000    0.000    0.000  UserDict.py:103(__contains__)
     9    0.000    0.000    0.000    0.000  UserDict.py:35(__getitem__)
     3    0.000    0.000    0.000    0.000  UserDict.py:91(get)
   115    0.004    0.000    0.021    0.000  __init__.py:122(dump)
   267    0.000    0.000    0.004    0.000  __init__.py:193(dumps)
   480    0.001    0.000    0.013    0.000  __init__.py:258(load)
   480    0.000    0.000    0.011    0.000  __init__.py:294(loads)
     1    0.000    0.000    0.002    0.002  __init__.py:99(
```
@titikid Thank you. I believe what's making it slow is writing temporary files during the computation; if I load things into memory instead of going through files, it should be faster.
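As a toy illustration of that point (the function names are invented, not the tool's code), here is a round trip of each record through a temporary JSON file contrasted with keeping everything in a plain dict; the two produce identical results, but the in-memory version avoids all the file I/O that dominates the profile above:

```python
import json
import os
import tempfile

# Current style: dump each intermediate result to a temp file, then load it back.
def via_temp_files(records):
    results = {}
    for name, data in records.items():
        path = os.path.join(tempfile.gettempdir(), name + ".json")
        with open(path, "w") as f:
            json.dump(data, f)
        with open(path) as f:
            results[name] = json.load(f)
        os.remove(path)
    return results

# Proposed style: just keep the intermediate results in memory.
def in_memory(records):
    return {name: data for name, data in records.items()}

records = {"class_%d" % i: list(range(10)) for i in range(100)}
assert via_temp_files(records) == in_memory(records)  # same output, far fewer syscalls
```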
When testing mAP with the VOC2007 data set, should the bbox marked as "difficult" be removed?
Oh right... the script that converts the xml annotations to our format is not currently checking whether the objects are tagged as difficult or not. I can add that.
Did you use this script?
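A hedged sketch of what that check could look like with `xml.etree.ElementTree`, assuming PASCAL-VOC-style annotations (this is illustrative, not the repository's actual `convert_gt_xml.py`):

```python
import xml.etree.ElementTree as ET

# Example PASCAL VOC annotation: <difficult> is 1 for hard objects, 0 otherwise.
VOC_XML = """<annotation>
  <object><name>tvmonitor</name><difficult>1</difficult>
    <bndbox><xmin>2</xmin><ymin>10</ymin><xmax>173</xmax><ymax>238</ymax></bndbox>
  </object>
  <object><name>book</name><difficult>0</difficult>
    <bndbox><xmin>439</xmin><ymin>157</ymin><xmax>556</xmax><ymax>241</ymax></bndbox>
  </object>
</annotation>"""

def convert(xml_string):
    """Convert VOC XML to the tool's line format, tagging difficult objects."""
    lines = []
    for obj in ET.fromstring(xml_string).iter("object"):
        name = obj.findtext("name")
        bb = obj.find("bndbox")
        coords = [bb.findtext(k) for k in ("xmin", "ymin", "xmax", "ymax")]
        suffix = " difficult" if obj.findtext("difficult") == "1" else ""
        lines.append(name + " " + " ".join(coords) + suffix)
    return lines

print(convert(VOC_XML))
# -> ['tvmonitor 2 10 173 238 difficult', 'book 439 157 556 241']
```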
Thanks, I hadn't noticed your script and wrote one myself. After removing the bboxes marked as "difficult", the mAP on VOC2007 improved from 81.5 to 85. https://github.com/Stinky-Tofu/YOLO_V3
I have also recently seen comparisons of AP across different object sizes (AP_small, AP_medium, AP_large). I think some people might find it useful, and in my opinion it would not be hard to implement. See e.g. page 3 of the YOLOv3 paper: https://pjreddie.com/media/files/papers/YOLOv3.pdf.
Hello @kocica, that sounds like a great idea! However, how do you define which objects are small, medium, and large in the image? Is there any standard rule?
We could also cluster the objects by their area, knowing in advance that we want to obtain 3 clusters.
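For what it's worth, a standard rule does exist: the COCO benchmark (which YOLOv3's AP_small/medium/large numbers are computed against) buckets objects by pixel area, with thresholds at 32² and 96². A small sketch:

```python
# COCO's size convention: small < 32^2 px, medium in [32^2, 96^2) px,
# large >= 96^2 px. Using it here would make the numbers comparable
# to published AP_small/medium/large results.
def size_bucket(width, height):
    area = width * height
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"

assert size_bucket(20, 20) == "small"     # 400 px^2
assert size_bucket(50, 50) == "medium"    # 2500 px^2
assert size_bucket(100, 100) == "large"   # 10000 px^2
```

Clustering by area would also work, but fixed thresholds keep results comparable across datasets.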
Hi @Cartucho, I just checked the latest code and it's good, even the speed. I added a function to the script convert_gt_xml.py that converts VOC's ground-truth files to your format with the "difficult" label; I will make a PR for that. Another request: could you add the VOC2007 metric calculation? (Your code uses the VOC2012 metric, and sometimes that is not enough for benchmarking.)
Hi @Cartucho, as reported here https://github.com/pjreddie/darknet/issues/956, can we modify our tool to use the VOC 2007 metric (and reject difficult objects)?
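For reference, the difference is in how AP is integrated: the VOC2007 metric averages the interpolated precision at 11 equally spaced recall points (0.0, 0.1, ..., 1.0), whereas VOC2010+ integrates over every recall point. A minimal sketch of the 11-point version (not the tool's implementation; `recall` and `precision` are assumed to be parallel lists from the ranked detections):

```python
# VOC2007 11-point interpolated AP: at each recall threshold t, take the
# maximum precision achieved at any recall >= t, then average the 11 values.
def voc2007_ap(recall, precision):
    ap = 0.0
    for i in range(11):
        t = i / 10.0
        p = max((p for r, p in zip(recall, precision) if r >= t), default=0.0)
        ap += p / 11.0
    return ap

# Toy precision/recall curve, purely for illustration.
recall = [0.1, 0.2, 0.4, 0.8]
precision = [1.0, 1.0, 0.8, 0.5]
print(round(voc2007_ap(recall, precision), 4))  # -> 0.6
```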