wenwei202 / caffe

Caffe for Sparse and Low-rank Deep Neural Networks

@wenwei202 Can you give any suggestions about doing sparse computation on the Android platform? #16

Closed wenston2006 closed 6 years ago

wenston2006 commented 6 years ago

Have you heard of a CNN library named ncnn? Its author tries to speed up CNNs on the Android platform using assembly-language optimization. Some people claim that im2col+GEMM is slower than ncnn on ARM chips, and that the im2col operation in particular costs a lot of time on Android. I asked the ncnn authors whether anyone had tried sparse matrix computation with ncnn, and they told me they have not. Someone else told me they are trying to improve the computation speed further using Winograd. However, it seems that Caffe2 and TensorFlow use im2col+GEMM.

So can you give any suggestions about doing sparse computation on the Android platform? As far as I can find, only Eigen supports sparse computation on Android. If Eigen uses im2col+GEMM, is it slower than assembly-optimized libraries such as ncnn?
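
For readers unfamiliar with the term, the im2col+GEMM approach mentioned above can be sketched as follows. This is a minimal illustration in plain Python, assuming a single-channel image, stride 1 and no padding; the function names `im2col` and `conv_as_gemm` are made up for this sketch and are not the Caffe, Caffe2, TensorFlow or ncnn APIs.

```python
def im2col(image, k):
    """Unfold every k x k patch of a 2-D image (stride 1, no padding) into one column."""
    h, w = len(image), len(image[0])
    patches = [
        [image[i + di][j + dj] for di in range(k) for dj in range(k)]
        for i in range(h - k + 1)
        for j in range(w - k + 1)
    ]
    # Transpose so that each patch becomes a column: shape (k*k, number_of_patches).
    return [list(col) for col in zip(*patches)]


def conv_as_gemm(image, kernels, k):
    """Convolve by one matrix product; `kernels` is a list of flattened k*k filters."""
    cols = im2col(image, k)
    # (n_filters x k*k) times (k*k x n_patches) gives one output row per filter.
    return [
        [sum(a * b for a, b in zip(kernel, col)) for col in zip(*cols)]
        for kernel in kernels
    ]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernels = [[1, 0, 0, 1]]  # adds the top-left and bottom-right value of each 2x2 patch
result = conv_as_gemm(image, kernels, 2)  # [[6, 8, 12, 14]]
```

The unfolding step copies each input value into several columns, which is the extra memory traffic the im2col cost claim refers to.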

wenwei202 commented 6 years ago

@wenston2006 I don't know ncnn. If you use SSL to remove rows and columns in the weight matrices, a sparse library is not necessary: the computation is still regular im2col+GEMM, just on smaller dense matrices.
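
The point about removing rows and columns can be sketched roughly as below. This is an illustration in plain Python rather than code from this repository, and the helper names `compact` and `scatter_rows` are hypothetical. It assumes the weight matrix already has rows and columns that are entirely zero, drops them, runs an ordinary dense product on what is left, and puts the result back at the original row positions.

```python
def matmul(a, b):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def compact(weights, inputs):
    """Drop all-zero rows and columns of `weights` and the matching rows of `inputs`."""
    kept_rows = [i for i, row in enumerate(weights) if any(v != 0 for v in row)]
    kept_cols = [j for j, col in enumerate(zip(*weights)) if any(v != 0 for v in col)]
    small_w = [[weights[i][j] for j in kept_cols] for i in kept_rows]
    small_x = [inputs[j] for j in kept_cols]
    return small_w, small_x, kept_rows


def scatter_rows(small_out, kept_rows, n_rows):
    """Place the compact output rows back at their original indices; removed rows stay zero."""
    n_cols = len(small_out[0]) if small_out else 0
    full = [[0] * n_cols for _ in range(n_rows)]
    for r, row in zip(kept_rows, small_out):
        full[r] = list(row)
    return full


W = [[1, 2, 0, 3], [0, 0, 0, 0], [4, 5, 0, 6]]  # row 1 and column 2 are all zero
X = [[1, 0], [0, 1], [7, 7], [1, 1]]

small_w, small_x, kept_rows = compact(W, X)     # 2x3 weights, 3x2 inputs
dense_small = matmul(small_w, small_x)          # plain dense product, no sparse format
restored = scatter_rows(dense_small, kept_rows, len(W))
```

Here `restored` equals `matmul(W, X)`, but the product itself ran on a 2x3 matrix instead of a 3x4 one, which is why no sparse storage format or sparse kernel is needed.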

kingofoz commented 6 years ago

Hi @wenwei202 Is it easy to port SSL to caffe2?

wenwei202 commented 6 years ago

@kingofoz The C++/CUDA kernels are available in Caffe, and you can use them to implement it. A Python implementation is also possible. We have a TensorFlow implementation of SSL.

kingofoz commented 6 years ago

Hi @wenwei202, is the TensorFlow implementation only for RNN/LSTM, not CNN?

wenwei202 commented 6 years ago

It's going to be similar.

kingofoz commented 6 years ago

I see. Thanks @wenwei202

kingofoz commented 6 years ago

Hi @wenwei202 I have another question. Is SSL applicable to networks that are based on group convolution?

wenwei202 commented 6 years ago

@kingofoz I think so. SSL is a general solution; the key is to figure out the groups. We have successfully extended it to RNNs, where it can aggressively reduce the hidden sizes. The implementation is here. I think it should be easy to extend it to group convolution.
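
To make "figuring out the groups" concrete, here is a minimal sketch of a group-lasso style penalty (a sum of per-group L2 norms) over two possible groupings of convolution weights. The layout `weights[filter][in_channel][kernel_value]` and the function names are assumptions made for this illustration, not the code in this repository.

```python
import math


def group_lasso(groups):
    """Sum of the L2 norms of each group of weights."""
    return sum(math.sqrt(sum(w * w for w in group)) for group in groups)


def filter_groups(weights):
    """One group per output filter: all weights belonging to that filter."""
    return [[w for channel in f for w in channel] for f in weights]


def channel_groups(weights):
    """One group per input channel: that channel's weights gathered across all filters."""
    n_channels = len(weights[0])
    return [[w for f in weights for w in f[c]] for c in range(n_channels)]


# Two filters, two input channels, two kernel values per channel.
weights = [
    [[3.0, 4.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
filter_penalty = group_lasso(filter_groups(weights))    # 5.0: only filter 0 is non-zero
channel_penalty = group_lasso(channel_groups(weights))  # 5.0: only channel 0 is non-zero
```

For a grouped convolution, one would presumably build the channel groups separately inside each convolution group, since each filter only sees the input channels of its own group. That is an assumption, not something confirmed in this thread.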

kingofoz commented 6 years ago

Thanks! @wenwei202