Unstructured sparse matrix operations are not typically faster than dense matrix operations until a very high level of sparsity.
Our model structure puts nearly all of the computation into dense 1×1 convolutions. This can be implemented with highly optimized general matrix multiply (GEMM) functions. Convolutions are often implemented via a GEMM, but they require an initial reordering in memory, called im2col, to map the convolution onto a GEMM. For instance, this approach is used in the popular Caffe package [15].
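To make the im2col-then-GEMM idea concrete, here is a minimal numpy sketch (not Caffe's actual implementation; stride 1, no padding, and the function names are my own):

```python
import numpy as np

def im2col(x, k):
    # x: (C, H, W) input; k: square kernel size (stride 1, no padding).
    # Each k*k patch of the input becomes one column, so the whole
    # convolution collapses into a single GEMM afterwards.
    C, H, W = x.shape
    out_h, out_w = H - k + 1, W - k + 1
    cols = np.empty((C * k * k, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            cols[:, i * out_w + j] = x[:, i:i + k, j:j + k].ravel()
    return cols

def conv2d_gemm(x, w):
    # w: (C_out, C_in, k, k) filters.
    c_out, c_in, k, _ = w.shape
    cols = im2col(x, k)                  # the memory reordering step
    out = w.reshape(c_out, -1) @ cols    # the GEMM
    out_h = x.shape[1] - k + 1
    return out.reshape(c_out, out_h, -1)

x = np.random.randn(3, 8, 8)
w = np.random.randn(16, 3, 3, 3)
y = conv2d_gemm(x, w)   # shape (16, 6, 6)
```

The im2col step is pure data movement; the arithmetic all happens in the one matrix multiply, which is why a fast BLAS makes the whole convolution fast.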
1×1 convolutions do not require this reordering in memory and can be implemented directly with GEMM which is one of the most optimized numerical linear algebra algorithms.
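A quick numpy sketch of why no reordering is needed for the 1×1 case: flattening the spatial dimensions already puts the input in GEMM layout (names here are illustrative, not from the repo):

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W) input; w: (C_out, C_in) pointwise filters.
    # A 1x1 convolution is exactly one GEMM over the flattened
    # spatial positions -- no im2col copy is required.
    c_in, H, W = x.shape
    y = w @ x.reshape(c_in, H * W)   # single GEMM call
    return y.reshape(-1, H, W)

x = np.random.randn(32, 7, 7)
w = np.random.randn(64, 32)
y = conv1x1(x, w)   # shape (64, 7, 7)
```

Here `reshape` is just a view of the same memory, so the data is never copied the way im2col copies it.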
MobileNet spends 95% of its computation time in 1×1 convolutions, which also contain 75% of the parameters.
So I want to ask: does this repo use a highly optimized GEMM? And also: does MobileNet need im2col before the standard and depthwise convolutions?
Thanks!
The content quoted above is from the original paper.