VDIGPKU / M2Det

M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network

The inference speed is low with FPS=9.53 on Titan with the m2det512_vgg.pth. #14

Open convolutionROC opened 5 years ago

convolutionROC commented 5 years ago

Hi, I downloaded the model m2det512_vgg.pth and tested it. The mAP is 37.8%. However, the FPS is only 9.53 on a Titan and 10.52 on a V100, which is much lower than the FPS=18 reported in the paper. Is there any trick?

qijiezhao commented 5 years ago

First, thanks for reproducing the accuracy. As for the speed, my guesses are: 1. Maybe the total time includes the image I/O time. 2. Maybe no CUDA synchronization around the timing? PyTorch launches GPU work asynchronously and synchronizes automatically only when a result is needed.
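
For reference, a minimal sketch of the kind of synchronized timing this refers to, measuring only the forward pass and excluding image I/O and post-processing. The commented-out usage (build_net, config, the 512x512 input) is a placeholder, not the repo's exact API:

```python
import time
import torch

def timed_forward(net, img_tensor, n_warmup=5, n_runs=50):
    """Time only the forward pass, excluding image I/O and post-processing."""
    net.eval()
    with torch.no_grad():
        # Warm up so cuDNN autotuning and lazy CUDA init are not measured.
        for _ in range(n_warmup):
            net(img_tensor)
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(n_runs):
            net(img_tensor)
        # Without this synchronize, the timer can stop before queued kernels
        # finish, and their cost shows up later (e.g. inside .cpu()).
        torch.cuda.synchronize()
        return (time.time() - start) / n_runs

# Hypothetical usage (build_net/config names are placeholders):
# net = build_net('test', 512, config).cuda()
# dummy = torch.randn(1, 3, 512, 512).cuda()
# print('Pure forward FPS: %.2f' % (1.0 / timed_forward(net, dummy)))
```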

yukkyo commented 5 years ago

Hi @qijiezhao !

In my environment (Titan V), I got mAP = 37.8% and FPS = 11.5. (Nice mAP 😆)

But the FPS is lower than the paper's (FPS = 18). Checking with the -m torch.utils.bottleneck option, .cpu() looks a little slow.

You said "1. Maybe the total time includes the image I/O time.", but in test.py, _t['im_detect'].tic() comes after img = testset.pull_image(i), so it seems to me that the influence of image I/O time is not critical.
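
A toy, self-contained illustration of that timer placement (the Timer class and the fake I/O/forward costs are simplified stand-ins, not the repo's code): because tic() fires after the image is pulled, only the detection step lands in the measured window.

```python
import time

class Timer:
    """Minimal stand-in for the Timer used in test.py (simplified)."""
    def __init__(self):
        self.start = 0.0
    def tic(self):
        self.start = time.time()
    def toc(self):
        return time.time() - self.start

def fake_pull_image(i):
    time.sleep(0.02)   # pretend image I/O costs 20 ms
    return i

def fake_detect(img):
    time.sleep(0.05)   # pretend the forward pass costs 50 ms
    return img

_t = {'im_detect': Timer()}
total = 0.0
for i in range(10):
    img = fake_pull_image(i)          # I/O happens before the timer starts...
    _t['im_detect'].tic()
    fake_detect(img)
    total += _t['im_detect'].toc()    # ...so only detection time is counted

print('Detect time per image: %.3fs' % (total / 10))   # ~0.050s, I/O excluded
```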

Are there any other points to slow down?

1. Result of reproducing

$ python test.py -c=configs/m2det512_vgg.py -m=weights/m2det512_vgg.pth
----------------------------------------------------------------------
|                       M2Det Evaluation Program                     |
----------------------------------------------------------------------
The Anchor info:
{'feature_maps': [64, 32, 16, 8, 4, 2], 'min_dim': 512, 'steps': [8, 16, 32, 64, 128, 256], 'min_sizes': [30.72, 76.8, 168.96, 261.12, 353.28, 445.44], 'max_sizes': [76.8, 168.96, 261.12, 353.28, 445.44, 537.6], 'aspect_ratios': [[2, 3], [2, 3], [2, 3], [2, 3], [2, 3], [2, 3]], 'variance': [0.1, 0.2], 'clip': True}
===> Constructing M2Det model
Loading resume network...
===> Finished constructing and loading model
loading annotations into memory...
Done (t=0.33s)
creating index...
index created!
minival2014 gt roidb loaded from /home/fujimoto/data/coco_cache/minival2014_gt_roidb.pkl
=> Total 5000 images to test.
Begin to evaluate
100%|#######################################################################################################################################################################################################################################| 5000/5000 [07:04<00:00, 12.10it/s]
===> Evaluating detections
Collecting Results......
Writing results json to eval/COCO/detections_minival2014_results.json
Loading and preparing results...
DONE (t=0.91s)
creating index...
index created!
Running per image evaluation...
useSegm (deprecated) is not None. Running bbox evaluation
Evaluate annotation type *bbox*
DONE (t=27.31s).
Accumulating evaluation results...
DONE (t=3.17s).
~~~~ Mean and per-category AP @ IoU=[0.50,0.95] ~~~~
37.8
~~~~ Summary metrics ~~~~
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.378
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.560
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.409
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.194
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.431
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.539
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.303
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.483
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.511
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.262
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.577
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.700
Wrote COCO eval results to: eval/COCO/detection_results.pkl
Detect time per image: 0.077s
Nms time per image: 0.010s
Total time per image: 0.087s
FPS: 11.507 fps

2. Result of running test.py with -m torch.utils.bottleneck

--------------------------------------------------------------------------------
  Environment Summary
--------------------------------------------------------------------------------
PyTorch 0.4.1 compiled w/ CUDA 9.0.176
Running with Python 3.6 and CUDA 9.2.148

`pip list` truncated output:
Unable to fetch
--------------------------------------------------------------------------------
  cProfile output
--------------------------------------------------------------------------------
         1106068 function calls (1052862 primitive calls) in 11.146 seconds

   Ordered by: internal time
   List reduced from 2636 to 15 due to restriction <15>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     8580    3.903    0.000    3.903    0.000 {built-in method conv2d}
        1    2.158    2.158    2.161    2.161 /home/yukkyo/work/layers/functions/prior_box.py:33(forward)
      104    2.116    0.020    2.116    0.020 {method 'cpu' of 'torch._C._TensorBase' objects}
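
The .cpu() hot spot above is likely, at least in part, GPU work being waited on there rather than a slow copy: CUDA kernels return control immediately, and the first call that needs the result (such as .cpu()) blocks until they finish. A tiny self-contained check of this effect, assuming a CUDA device is available:

```python
import time
import torch

# GPU kernels are launched asynchronously, so the wall-clock cost of heavy
# compute can be "billed" to the first call that forces a sync, e.g. .cpu().
x = torch.randn(4096, 4096, device='cuda')

torch.cuda.synchronize()
t0 = time.time()
y = x @ x            # returns almost immediately; work is queued on the GPU
t1 = time.time()
y_cpu = y.cpu()      # blocks until the matmul actually finishes
t2 = time.time()

print('launch: %.4fs, .cpu(): %.4fs' % (t1 - t0, t2 - t1))
# The second number contains most of the matmul time, which is why the
# profiler output above shows the 'cpu' method as a hot spot.
```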
yukkyo commented 5 years ago

Additionally, after changing PyTorch from 0.4.1 to 1.0.1.post2, the FPS went from 11.5 to 15.3.

$ python test.py -c=configs/m2det512_vgg.py -m=weights/m2det512_vgg.pth
...
~~~~ Mean and per-category AP @ IoU=[0.50,0.95] ~~~~
37.8
~~~~ Summary metrics ~~~~
...
Detect time per image: 0.056s
Nms time per image: 0.010s
Total time per image: 0.065s
FPS: 15.333 fps
moyans commented 5 years ago

On my custom dataset (with soft-NMS set to False), 320*320 input only reaches about 15 FPS on a Titan X with CUDA 9, cuDNN 7, and torch 1.0.1.post2.

The Anchor info:
{'feature_maps': [40, 20, 10, 5, 3, 1], 'min_dim': 320, 'steps': [8, 16, 32, 64, 107, 320], 'min_sizes': [25.6, 48.0, 105.6, 163.2, 220.8, 278.4], 'max_sizes': [48.0, 105.6, 163.2, 220.8, 278.4, 336.0], 'aspect_ratios': [[2, 3], [2, 3], [2, 3], [2, 3], [2, 3], [2, 3]], 'variance': [0.1, 0.2], 'clip': True}
===> Constructing M2Det model
Loading resume network...
===> Finished constructing and loading model
loading annotations into memory...
Done (t=0.27s)
creating index...
index created!
=> Total 779 images to test.
Begin to evaluate
100%|█████████████████████████████████████████████| 779/779 [01:30<00:00, 6.86it/s]
===> Evaluating detections
Loading and preparing results...
DONE (t=0.26s)
creating index...
index created!
Running per image evaluation...
useSegm (deprecated) is not None. Running bbox evaluation
...
Detect time per image: 0.064s
Nms time per image: 0.001s
Total time per image: 0.065s
FPS: 15.445 fps

devendraswamy commented 4 years ago

File "test.py", line 114, in thresh = cfg.test_cfg.score_threshold) File "test.py", line 84, in test_net testset.evaluate_detections(all_boxes, save_folder) AttributeError: 'CustomDataset' object has no attribute 'evaluate_detections'

Could you please help me solve this error? Thank you!
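
For reference, the traceback only says that CustomDataset lacks an evaluate_detections method with the signature that test_net calls, i.e. testset.evaluate_detections(all_boxes, save_folder). A minimal, hypothetical stub along those lines is sketched below; it just dumps the raw detections instead of computing mAP, since a real evaluation depends on your annotation format.

```python
import os
import pickle

class CustomDataset:
    # ... existing __init__, pull_image, etc. stay as they are ...

    def evaluate_detections(self, all_boxes, save_folder):
        """Hypothetical stub: save raw detections instead of computing mAP.

        all_boxes is expected to be indexed as [class][image] -> array of
        (x1, y1, x2, y2, score) rows, matching how test_net collects results.
        """
        os.makedirs(save_folder, exist_ok=True)
        out_path = os.path.join(save_folder, 'detections.pkl')
        with open(out_path, 'wb') as f:
            pickle.dump(all_boxes, f, pickle.HIGHEST_PROTOCOL)
        print('Wrote raw detections to:', out_path)
        # A real implementation would compare all_boxes against the dataset's
        # ground-truth annotations and report AP/AR here.
```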