Tencent / PocketFlow

An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
https://pocketflow.github.io

Are you supporting sparse matrix operations or graph surgery? #261

Closed sjhan91 closed 5 years ago

sjhan91 commented 5 years ago

Hi, I'm working on pruning at the level of neurons and channels.

Most TensorFlow code on GitHub implements pruning by setting unimportant weights or channels to zero. However, simply zeroing them out neither reduces the model size nor accelerates inference.
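
To illustrate the point (a minimal NumPy sketch, not code from this repo): magnitude pruning by masking leaves the weight tensor dense, so its memory footprint and FLOP count are unchanged.

```python
import numpy as np

# Toy weight matrix; in a real model this would be a dense/conv kernel.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)

# Magnitude pruning: zero out the 75% of weights with the smallest |w|.
threshold = np.quantile(np.abs(w), 0.75)
mask = (np.abs(w) >= threshold).astype(np.float32)
w_pruned = w * mask

# The pruned tensor is still a dense float32 array of the same shape,
# so it occupies exactly the same memory as the original weights.
print(w.nbytes, w_pruned.nbytes)       # identical byte counts
print(int((w_pruned == 0).sum()))      # number of zeroed entries
```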

In your project, did you solve this problem by supporting sparse matrix operations for weight sparsification, or TensorFlow graph surgery for channel pruning?

I skimmed your code, but I could only find masked weights or `channel[:, :, idx, :] = 0` syntax.
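
For comparison, here is a hedged sketch (plain NumPy, hypothetical kernel shapes) of the difference between the masking syntax quoted above and actual graph surgery, which physically removes a channel and shrinks the tensor:

```python
import numpy as np

# Toy 3x3 conv kernel with 8 input and 16 output channels
# (TensorFlow layout: [height, width, in_channels, out_channels]).
kernel = np.random.default_rng(1).normal(size=(3, 3, 8, 16)).astype(np.float32)

# Masking (what the question quotes): zero a channel; shape is unchanged.
masked = kernel.copy()
masked[:, :, 3, :] = 0.0
print(masked.shape)    # (3, 3, 8, 16) -- same size, same FLOPs

# Graph surgery: physically delete the channel, shrinking the tensor.
# The layer producing this kernel's input must be shrunk to match.
surgered = np.delete(kernel, 3, axis=2)
print(surgered.shape)  # (3, 3, 7, 16) -- fewer weights, fewer FLOPs
```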

Thanks.

jiaxiang-wu commented 5 years ago

It is difficult to achieve actual speed-up with sparse matrix multiplication unless the ratio of non-zero entries is sufficiently small. Since TensorFlow Lite does not support sparse matrix multiplication, we do not provide the corresponding model-conversion script (from a .ckpt model to a .pb model). Currently, we only support training with a sparsity constraint, which does not bring actual speed-up during inference.
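
A small SciPy sketch of why moderate sparsity does not pay off (assuming CSR as the sparse format; this is an illustration, not PocketFlow code): the index arrays of a sparse format carry their own cost, so at 50% sparsity CSR can take *more* memory than the dense matrix, and sparse matvec adds indexing overhead per non-zero entry.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
dense = rng.normal(size=(256, 256)).astype(np.float32)

# Magnitude-prune 50% of the entries (a moderate sparsity level).
dense[np.abs(dense) < np.median(np.abs(dense))] = 0.0

# Convert the masked dense matrix to CSR.
csr = sparse.csr_matrix(dense)

# CSR stores values + column indices + row pointers, so at 50% sparsity
# it needs slightly MORE memory than the dense array it came from.
csr_bytes = csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes
print(dense.nbytes, csr_bytes)

# Both representations compute the same product; the sparse one only
# starts winning once the non-zero ratio is very small.
x = rng.normal(size=(256,)).astype(np.float32)
y_dense = dense @ x
y_sparse = csr @ x
```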