Tencent / PocketFlow

An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
https://pocketflow.github.io

Why is the model after weight-sparse larger than the original model? #276

Open jimengying opened 5 years ago

jimengying commented 5 years ago

After compressing with `--learner weight-sparse --ws_prune_ratio 0.5`, the compressed model is even larger than the original model.

yuanyuanli85 commented 5 years ago

The weight-sparse learner sets the masked weights to zero instead of removing them from the weight tensors. Since TensorFlow does not natively support compressed sparse models, I think there is no need for PocketFlow to compress them.
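
For intuition, here is a minimal NumPy sketch of magnitude-based masking (illustrative only, not PocketFlow's actual code; the variable names are made up):

```python
import numpy as np

weights = np.random.randn(4, 4).astype(np.float32)
prune_ratio = 0.5  # analogous to --ws_prune_ratio

# Zero out the smallest-magnitude half of the weights via a 0/1 mask.
threshold = np.quantile(np.abs(weights), prune_ratio)
mask = (np.abs(weights) > threshold).astype(np.float32)

pruned = weights * mask             # zeros in place, same shape as before
print(pruned.size == weights.size)  # True: no weight is actually removed
```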

jiaxiang-wu commented 5 years ago

1. Why is the compressed model larger than the original one? Zero-valued weights are not removed from the compressed model, and we also need (at least during training) 0/1-valued mask tensors, one per prunable weight tensor and of the same size, to record which weights should be masked out (see the first sketch after this list).

2. Inference with a model compressed via weight sparsification. As mentioned by @yuanyuanli85, TensorFlow (like many other DL frameworks) does not provide inference speed-ups for weight-sparsified models, so no actual speed-up can be observed. The only benefit is that you can try to store the model weights as sparse tensors to save disk space (see the second sketch after this list).

3. Why does PocketFlow still support compression via weight sparsification? This is mostly an experimental component. Users may keep improving weight sparsification algorithms, and once the sparsity ratio becomes really high (e.g., >90%) with no accuracy degradation, support for faster inference with weight-sparsified models may emerge.
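
To make the size overhead in point 1 concrete, a back-of-the-envelope sketch (assuming float32 weights and a mask checkpointed at the same precision alongside each prunable tensor):

```python
# Storage for one prunable 512x512 fully-connected layer (illustrative).
num_weights = 512 * 512
bytes_per_float = 4  # float32

original = num_weights * bytes_per_float        # weights only
compressed = 2 * num_weights * bytes_per_float  # weights + 0/1 mask

print(f"original:   {original / 1024:.0f} KiB")    # 1024 KiB
print(f"compressed: {compressed / 1024:.0f} KiB")  # 2048 KiB, ~2x larger
```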
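And for point 2, one way to realize the disk-space saving is to dump only the non-zero entries in COO form (indices plus values) and rebuild the dense tensor at load time. The file name and threshold below are made up for the demo:

```python
import numpy as np

dense = np.random.randn(512, 512).astype(np.float32)
dense[np.abs(dense) < 1.8] = 0.0  # fake a >90% sparsity ratio

# Save only the non-zero coordinates and values.
rows, cols = np.nonzero(dense)
np.savez_compressed("layer0_sparse.npz",
                    rows=rows.astype(np.int32),
                    cols=cols.astype(np.int32),
                    vals=dense[rows, cols],
                    shape=np.array(dense.shape))

# Rebuild the dense tensor for inference.
f = np.load("layer0_sparse.npz")
restored = np.zeros(tuple(f["shape"]), dtype=np.float32)
restored[f["rows"], f["cols"]] = f["vals"]
assert np.array_equal(restored, dense)
```

This only pays off when the sparsity is high enough that the indices plus values take less space than the dense tensor itself.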