zhreshold / mxnet-ssd

MXNet port of SSD: Single Shot MultiBox Object Detector. Reimplementation of https://github.com/weiliu89/caffe/tree/ssd
MIT License

Mobilenet slower than vgg16? #202

Open mfiore opened 6 years ago

mfiore commented 6 years ago

Hi,

I have trained three SSD models, using vgg16_reduced, mobilenet_512 and mobilenet_608. After that I am running inference on a video, using batches of a single frame, and comparing the speed of the three models on a PC with an i7 7700k and a GeForce 1080 Ti.

I was surprised by the results:

- vgg16: 70 fps
- mobilenet_608: 45 fps
- mobilenet_512: 60 fps

I'm measuring the time by calling OpenCV's getTickCount() before and after the forward operation.
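For reference, the measurement pattern looks roughly like the sketch below (using the standard library's timer instead of cv2.getTickCount(); `forward` is a hypothetical stand-in for the model's forward pass):

```python
import time

def measure_fps(forward, n_frames=100):
    """Time n_frames calls to `forward` and return frames per second.
    Equivalent to bracketing the call with cv2.getTickCount() /
    cv2.getTickFrequency()."""
    start = time.perf_counter()
    for _ in range(n_frames):
        forward()
    elapsed = time.perf_counter() - start
    return n_frames / elapsed
```

One caveat when timing MXNet this way: execution is asynchronous, so the output should actually be fetched (e.g. via asnumpy() or wait_to_read()) inside the timed region; otherwise the timer may only capture the cost of enqueuing the operators.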

I was expecting mobilenet to be a lot faster. Any ideas why it isn't? I have read that cuDNN has problems optimizing depthwise convolutions. Is this related to that?
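The expectation that MobileNet should be faster comes from its operation count: a depthwise separable convolution does far fewer multiply-accumulates than a standard convolution of the same shape. A small back-of-the-envelope sketch (function names are just for illustration) makes the gap concrete; whether it translates into wall-clock speed depends entirely on how well the depthwise op is implemented on the GPU:

```python
def conv_macs(k, c_in, c_out, h, w):
    # multiply-accumulates for a standard k x k convolution
    return k * k * c_in * c_out * h * w

def depthwise_separable_macs(k, c_in, c_out, h, w):
    # depthwise k x k convolution (one filter per input channel)
    # followed by a 1x1 pointwise convolution
    return k * k * c_in * h * w + c_in * c_out * h * w

# Example layer: 3x3 convolution, 256 -> 256 channels, 64x64 feature map
std = conv_macs(3, 256, 256, 64, 64)
sep = depthwise_separable_macs(3, 256, 256, 64, 64)
print(std / sep)  # roughly 8.7x fewer MACs for the separable version
```

So on paper the separable layer is nearly an order of magnitude cheaper here, yet a poorly optimized depthwise kernel can easily erase that advantage in practice.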

I trained all three on my dataset (which has a single class). The two mobilenet models were trained from the pretrained models using --finetune, --network mobilenet and --data-shape 512 (or 608). If I got it right, with vgg16 you don't need to use --finetune, and can simply use the starting epoch of the model.

zhreshold commented 6 years ago

MobileNet was originally very slow on GPU; last year the depthwise convolution op was optimized. But I am not entirely sure whether cuDNN is involved in this operation.