OMG59E closed this issue 6 years ago
Is that for one image, as in the demo, or on the test set?
run demo.py:
[13:31:38] src/operator/./cudnn_convolution-inl.h:55: Running performance tests to find the best convolution algorithm, this can take a while...
3.25967097282 s
class ---- [[x1, x2, y1, y2, confidence]]
--------- bus ---------
[[ 86.82266235 50.60977554 428.10751343 305.20828247 0.99838781]]
results saved to data/demo/000456_result.jpg
run test.py:
testing 4941/4952 data 0.0267s net 0.1038s post 0.0024s
testing 4942/4952 data 0.0296s net 0.1068s post 0.0023s
testing 4943/4952 data 0.0292s net 0.1164s post 0.0019s
testing 4944/4952 data 0.0293s net 0.1104s post 0.0023s
testing 4945/4952 data 0.0271s net 0.1015s post 0.0014s
testing 4946/4952 data 0.0267s net 0.1347s post 0.0010s
testing 4947/4952 data 0.0252s net 0.1022s post 0.0019s
testing 4948/4952 data 0.0257s net 0.1018s post 0.0018s
testing 4949/4952 data 0.0299s net 0.1115s post 0.0021s
testing 4950/4952 data 0.0265s net 0.1041s post 0.0032s
testing 4951/4952 data 0.0279s net 0.1127s post 0.0032s
So basically the cold start of a CUDA application is slow: the first forward pass includes the cuDNN autotune run shown in your demo log. Turn off cuDNN autotune (see script) for the demo.
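To see how much of the ~3.3 s is cold start, one option is to time the first call separately from the following ones. The snippet below is only a sketch: `time_first_and_rest` and `FakeDetector` are hypothetical names, and the fake detector just sleeps to imitate a slow first call. With a real MXNet model, the timed function would also need to block until the result is ready (for example by calling `.asnumpy()` on an output), since MXNet executes asynchronously.

```python
import time


def time_first_and_rest(fn, n_timed=5):
    """Return (first_call_seconds, mean_seconds_of_later_calls) for a callable."""
    start = time.perf_counter()
    fn()  # first call: pays any one-off setup cost
    first = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(n_timed):
        fn()
    steady = (time.perf_counter() - start) / n_timed
    return first, steady


class FakeDetector:
    """Hypothetical stand-in for a detector: slow on the first call only."""

    def __init__(self):
        self.warm = False

    def __call__(self):
        time.sleep(0.001 if self.warm else 0.05)
        self.warm = True


first, steady = time_first_and_rest(FakeDetector())
print("first call: %.4f s, steady state: %.4f s" % (first, steady))
```

If the steady-state number is close to what test.py reports, the gap is warm-up rather than the network itself.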
Just like this?
export MXNET_CUDNN_AUTOTUNE_DEFAULT=0
export MXNET_ENABLE_GPU_P2P=0
export PYTHONUNBUFFERED=1
MXNET_ENABLE_GPU_P2P=0 will turn off GPU P2P communication, as the name suggests, so I think you can leave it.
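If exporting in the shell is inconvenient, the autotune variable can also be set from Python. This is a minimal sketch, assuming it is placed at the very top of the demo script; setting it before `import mxnet` is the safe choice, because the value has to be in the environment before the first convolution is set up.

```python
import os

# Disable cuDNN autotune for this process. Do this before importing mxnet
# so the setting is in place when the first convolution is created.
os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"

# import mxnet as mx  # left commented so the sketch runs without MXNet installed

print(os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"])
```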
Your test script takes about 140~150 ms per image (data + net + post). I think the ResNet paper mentioned ~400 ms for ResNet-101, and py-faster-rcnn doesn't have ResNet. Should have closed this......
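The per-image figure can be checked by adding up the data, net and post columns of the test.py output. Below is a rough sketch that parses lines in that format; `per_image_seconds` is a hypothetical helper name, and the sample is just the first three lines of the log posted in this thread.

```python
import re

# Matches lines such as: "testing 4941/4952 data 0.0267s net 0.1038s post 0.0024s"
LINE_RE = re.compile(
    r"testing (\d+)/(\d+) data ([\d.]+)s net ([\d.]+)s post ([\d.]+)s"
)


def per_image_seconds(log_text):
    """Return a list of data + net + post totals, one per matching log line."""
    totals = []
    for match in LINE_RE.finditer(log_text):
        data, net, post = (float(match.group(i)) for i in (3, 4, 5))
        totals.append(data + net + post)
    return totals


sample = """\
testing 4941/4952 data 0.0267s net 0.1038s post 0.0024s
testing 4942/4952 data 0.0296s net 0.1068s post 0.0023s
testing 4943/4952 data 0.0292s net 0.1164s post 0.0019s
"""

totals = per_image_seconds(sample)
mean = sum(totals) / len(totals)
print("images: %d, mean per image: %.4f s" % (len(totals), mean))
```

On these three lines the mean comes out at roughly 0.14 s, in line with the estimate above.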
I trained with mxnet-rcnn, but detecting a single image takes ~3.3 s:
Method: Faster R-CNN end-to-end
Network: VGG16
Training Data: VOC07+12
Testing Data: VOC07test
Result: 75.66
GPU: Titan X (Maxwell)
Why?