Closed sjhan91 closed 5 years ago
It is difficult to achieve an actual speed-up with sparse matrix multiplication unless the ratio of non-zero entries is sufficiently small. Since TensorFlow Lite does not support sparse matrix multiplication, we also do not provide the corresponding model conversion script (from a .ckpt model to a .pb model). Currently, we only support training with a sparsity constraint, and this does not bring an actual speed-up during inference.
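To illustrate why the non-zero ratio matters (a pure-NumPy sketch for this thread, not code from this repository): in a CSR-style sparse product, the inner loop only touches stored non-zeros, so the work scales with their count, and it only beats a dense product when that count is small.

```python
import numpy as np

def dense_to_csr(a):
    """Convert a dense matrix to (data, indices, indptr) CSR arrays."""
    data, indices, indptr = [], [], [0]
    for row in a:
        nz = np.nonzero(row)[0]
        indices.extend(nz)
        data.extend(row[nz])
        indptr.append(len(indices))
    return np.array(data), np.array(indices), np.array(indptr)

def csr_matvec(data, indices, indptr, x):
    """y = A @ x, touching only the stored non-zero entries."""
    y = np.zeros(len(indptr) - 1)
    for i in range(len(y)):
        start, end = indptr[i], indptr[i + 1]
        y[i] = data[start:end] @ x[indices[start:end]]
    return y

rng = np.random.default_rng(0)
a = rng.random((100, 100)) * (rng.random((100, 100)) < 0.05)  # ~95% zeros
x = rng.random(100)
data, indices, indptr = dense_to_csr(a)
assert np.allclose(csr_matvec(data, indices, indptr, x), a @ x)
```

With ~95% sparsity the CSR product does roughly 1/20 of the multiply-adds; at mild sparsity the indexing overhead dominates and the dense product wins.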
Hi, I'm working on pruning at the level of neurons and channels.
Most TensorFlow pruning code on GitHub describes pruning as setting unimportant weights or channels to 0. However, just zeroing them out does not make the model smaller or accelerate inference.
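The point can be seen directly at the tensor level (a minimal NumPy sketch, with a hypothetical conv-kernel shape): a masked tensor keeps exactly the same shape and byte size, so a dense kernel still stores and multiplies every entry, zeros included.

```python
import numpy as np

# Hypothetical conv kernel: (height, width, in_channels, out_channels)
w = np.random.default_rng(1).random((3, 3, 64, 128)).astype(np.float32)

masked = w.copy()
masked[:, :, :32, :] = 0.0  # "prune" half the input channels by zeroing them

# Nothing is actually saved: same shape, same memory footprint,
# and a dense matmul/conv still processes all entries.
assert masked.shape == w.shape
assert masked.nbytes == w.nbytes
```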
In your project, did you solve this problem by supporting sparse matrix operations for weight sparsification, or by TensorFlow graph surgery for channel pruning?
I skimmed your code, but I could only find masked weights or `channel[:, :, idx, :] = 0` syntax.
Thanks.
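For contrast with the masking syntax quoted in the question, here is a hypothetical sketch of what channel-level graph surgery means at the tensor level: physically deleting the pruned channels changes the kernel's shape, while masking leaves it unchanged. (In a real TensorFlow graph this additionally requires rebuilding the ops with smaller variables and copying the surviving weights over.)

```python
import numpy as np

kernel = np.zeros((3, 3, 64, 128), dtype=np.float32)  # H, W, in_ch, out_ch
prune_idx = [0, 5, 7]  # hypothetical channels selected for removal

# Masking (what the question found in the code): shape is unchanged.
masked = kernel.copy()
masked[:, :, prune_idx, :] = 0
assert masked.shape == (3, 3, 64, 128)

# Surgery: the channel axis actually shrinks, so downstream ops do less work.
surgered = np.delete(kernel, prune_idx, axis=2)
assert surgered.shape == (3, 3, 61, 128)
```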