mathmanu / caffe-jacinto-models

This repository has moved. The new link can be obtained from https://github.com/TexasInstruments/jacinto-ai-devkit

Regarding performance comparison #8

Closed · aashish-kumar closed 5 years ago

aashish-kumar commented 5 years ago

Is there any performance comparison done for object detection for the following two models: jdetnet21 vs. mobilenet? I see that jdetnet21 is loosely based on mobilenet. Are there any particular benefits of using one over the other?

mathmanu commented 5 years ago

The definition of jdetnet21 is here: https://github.com/tidsp/caffe-jacinto-models/blob/caffe-0.17/scripts/models/jacintonet_v2.py#L205

I don't see a connection. Can you tell me specifically why you think it is "loosely" based on mobilenet?

aashish-kumar commented 5 years ago

Both are OD models. mobilenet has a cascaded block (1x1 conv followed by 3x3 depthwise conv), while jdetnet21 has a cascaded block (3x3 conv followed by 3x3 depthwise conv). For GEMM libraries, 1x1 convs are typically quite fast, which makes mobilenet suitable for small devices (from the mobilenet paper), but I see that jdetnet is not using 1x1 convs.
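To make the cost argument concrete, here is a back-of-the-envelope MAC count for the two block styles (the feature-map shape, channel count, and group count are illustrative assumptions, not values taken from either model definition):

```python
# Rough multiply-accumulate (MAC) counts for the two cascades discussed above.
# All shapes below are illustrative assumptions.

def conv_macs(h, w, c_in, c_out, k, groups=1):
    """MACs for a k x k convolution at stride 1 with 'same' padding."""
    return h * w * k * k * (c_in // groups) * c_out

H, W, C = 32, 32, 128  # hypothetical feature-map size and channel count

# mobilenet-style block: 3x3 depthwise conv + 1x1 pointwise conv
dw_sep = conv_macs(H, W, C, C, 3, groups=C) + conv_macs(H, W, C, C, 1)

# jdetnet21-style block: dense 3x3 conv + 3x3 grouped conv (4 groups assumed)
jdet = conv_macs(H, W, C, C, 3) + conv_macs(H, W, C, C, 3, groups=4)

print(f"depthwise-separable block: {dw_sep / 1e6:.1f} MMACs")
print(f"3x3 + grouped 3x3 block:   {jdet / 1e6:.1f} MMACs")
```

The depthwise-separable block needs far fewer MACs on paper; whether that translates into wall-clock speed depends on how well the depthwise conv maps onto the GEMM library, which is exactly the point raised above.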

mathmanu commented 5 years ago

I understand now. I think you are comparing to mobiledetnet, defined here (and not to mobilenet): https://github.com/tidsp/caffe-jacinto-models/blob/caffe-0.17/scripts/models/mobilenet.py#L172

One correction: jdetnet21 has a cascaded 3x3 conv followed by a 3x3 grouped convolution (see the sketch after the list below).

Both of them work well. You can choose either one.

  1. Notice that we have mostly used the mobiledetnet-0.5 (i.e. half the number of channels) version of mobiledetnet by default. This model is quite small, so it should run fast without the need for sparsification.

  2. Another option is to use jdetnet21 and sparsify the model.
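For reference, here is a minimal caffe.NetSpec sketch of the two cascades (channel counts, input shape, and the choice of 4 groups are illustrative assumptions; see jacintonet_v2.py and mobilenet.py linked above for the actual definitions):

```python
# Sketch of the two block styles in pycaffe's NetSpec; shapes are assumptions.
import caffe
from caffe import layers as L

n = caffe.NetSpec()
n.data = L.Input(shape=[dict(dim=[1, 128, 32, 32])])

# mobiledetnet-style cascade: 3x3 depthwise conv (group == channels) + 1x1 conv
n.dw = L.Convolution(n.data, kernel_size=3, pad=1, num_output=128, group=128)
n.pw = L.Convolution(n.dw, kernel_size=1, num_output=128)

# jdetnet21-style cascade: dense 3x3 conv + 3x3 grouped conv (4 groups assumed)
n.c3 = L.Convolution(n.data, kernel_size=3, pad=1, num_output=128)
n.g3 = L.Convolution(n.c3, kernel_size=3, pad=1, num_output=128, group=4)

print(n.to_proto())  # emit the corresponding prototxt for inspection
```

In Caffe both styles use the same Convolution layer; depthwise is just the special case where group equals the number of channels.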

aashish-kumar commented 5 years ago

I tried to benchmark these two models on a GTX 1080 Ti GPU:

  - mobiledetnet-0.5: 17 fps
  - jdetnet (without sparsification): 100 fps

The performance difference is quite large. I am planning to run them on a TDA2px EVM board.
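For reference, one way to reproduce such an fps measurement with pycaffe (a sketch; the deploy.prototxt path and the random input are placeholders for the actual model):

```python
# Rough fps measurement of a Caffe model's forward pass on GPU (a sketch).
import time
import numpy as np
import caffe

caffe.set_mode_gpu()
caffe.set_device(0)

# Weights are not needed for a pure speed test; "deploy.prototxt" is a placeholder.
net = caffe.Net("deploy.prototxt", caffe.TEST)
net.blobs["data"].data[...] = np.random.rand(*net.blobs["data"].data.shape)

net.forward()  # warm-up
iters = 100
t0 = time.time()
for _ in range(iters):
    net.forward()
print(f"{iters / (time.time() - t0):.1f} fps")
```

The bundled caffe time tool (e.g. `caffe time -model deploy.prototxt -gpu 0 -iterations 50`) is an alternative that also reports per-layer timings.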

mathmanu commented 5 years ago

Do let us know your result.

As I have written in the Acknowledgements section of this page (https://github.com/tidsp/caffe-jacinto), the implementation of depthwise convolution in Caffe is not at all optimal. I have observed almost a 4x speedup for mobilenet-0.5 by using the alternate implementation mentioned there.
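If you want to try such an alternate implementation, one common approach is to rewrite Convolution layers that are effectively depthwise (group equals num_output) to a dedicated layer type. A sketch is below; the layer type name "DepthwiseConvolution" and the file names are assumptions for illustration, not taken from this thread:

```python
# Sketch: retarget effectively-depthwise Convolution layers in a prototxt to a
# dedicated depthwise layer type (the type name here is an assumption).
from caffe.proto import caffe_pb2
from google.protobuf import text_format

net = caffe_pb2.NetParameter()
with open("deploy.prototxt") as f:          # placeholder input file
    text_format.Merge(f.read(), net)

for layer in net.layer:
    p = layer.convolution_param
    if layer.type == "Convolution" and p.group > 1 and p.group == p.num_output:
        layer.type = "DepthwiseConvolution"  # hypothetical registered type name

with open("deploy_dw.prototxt", "w") as f:  # placeholder output file
    f.write(text_format.MessageToString(net))
```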