Tencent / PocketFlow

An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
https://pocketflow.github.io

Does PocketFlow support stage-wise pruning, sparsification, and quantization for one model? #182

Closed ysh329 closed 5 years ago

ysh329 commented 5 years ago

Hi, all,

I found some test cases in the docs, but each of them uses only a single compression method, such as pruning, weight sparsification, or weight quantization.

Does PocketFlow support applying these compression methods stage-wise (pruning, then weight sparsification, then weight quantization) to a single model?


We found that after applying one compression method, such as pruning, several model files are left behind. Can we use these model files as input to the next compression stage? If so, are there any examples or detailed documentation?

Thanks in advance.

jiaxiang-wu commented 5 years ago

For now, no. Each learner introduces some extra TensorFlow OPs into the graph, and it is hard to remove these OPs from the checkpoint (ckpt) files (for .pb & .tflite models, these OPs can be removed). This causes trouble when a subsequent learner is applied. A possible approach for combining channel pruning with other learners (e.g. weight sparsification, quantization) is to manually create a new model with fewer channels, load the weights from the channel-pruned model into it, and then use that model with the other learners.
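The manual "shrink and reload" step described above can be sketched in plain NumPy. Everything here is illustrative and assumed (the variable names, layer shapes, and the `kept` channel indices are not part of the PocketFlow API); it only shows how the surviving channels of a pruned layer, and the matching input slice of the following layer, would be copied into a smaller model's weight tensors:

```python
import numpy as np

# Hypothetical weights for a tiny two-conv-layer model (shapes and names
# are illustrative assumptions, not PocketFlow internals).
rng = np.random.default_rng(0)
w1 = rng.standard_normal((3, 3, 3, 8))   # conv1 kernel: H x W x C_in x C_out
b1 = rng.standard_normal(8)              # conv1 bias
w2 = rng.standard_normal((3, 3, 8, 16))  # conv2 kernel consumes conv1's outputs

# Suppose channel pruning kept only these output channels of conv1.
kept = np.array([0, 2, 3, 6])

# Build the "new model with fewer channels" by slicing the pruned weights.
w1_small = w1[:, :, :, kept]             # conv1 keeps 4 output channels
b1_small = b1[kept]                      # bias shrinks to match
w2_small = w2[:, :, kept, :]             # conv2's *input* side shrinks too

print(w1_small.shape, b1_small.shape, w2_small.shape)
```

The smaller arrays would then be assigned into a freshly defined graph with the reduced channel counts, which can serve as the starting point for a second learner (e.g. weight sparsification or quantization).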

ysh329 commented 5 years ago

Thanks. 🙇