PyTorch implementation of 'Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding' by Song Han, Huizi Mao, William J. Dally
This repository implements the three core methods of the Deep Compression paper: pruning, trained quantization (weight sharing), and Huffman coding.
The following packages are required for this project
or just use Docker:
$ docker pull tonyapplekim/deepcompressionpytorch
$ python pruning.py
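This command prunes the network and retrains it; the next step reads the retrained model from saves/model_after_retraining.ptmodel.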
You can control other values through command-line options. To list them, run:
$ python pruning.py --help
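To make the pruning step more concrete, here is a minimal sketch of magnitude-based pruning in plain Python (no PyTorch, so it runs anywhere). It is an illustration, not the code in pruning.py: the function name `prune_by_magnitude`, the `sensitivity` parameter, and the choice of threshold (sensitivity times the standard deviation of the layer's weights) are assumptions made for this example.

```python
import statistics


def prune_by_magnitude(weights, sensitivity=1.0):
    """Zero out weights whose magnitude is below sensitivity * std(weights).

    Assumed rule for this sketch; the actual script may pick its
    threshold differently. Returns the pruned weights and a 0/1 mask.
    """
    threshold = sensitivity * statistics.pstdev(weights)
    mask = [1 if abs(w) >= threshold else 0 for w in weights]
    pruned = [w if keep else 0.0 for w, keep in zip(weights, mask)]
    return pruned, mask


# Toy "layer" with a few large weights and several near-zero ones.
layer = [0.8, -0.05, 0.02, -0.9, 0.1, 0.6, -0.01, 0.03]
pruned, mask = prune_by_magnitude(layer, sensitivity=1.0)
print(f"kept {sum(mask)} of {len(layer)} weights: {pruned}")
```

During retraining, a mask like this would typically be re-applied after every optimizer step so that pruned connections stay at zero.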
$ python weight_share.py saves/model_after_retraining.ptmodel
This command applies weight sharing to the retrained model and saves the result to saves/model_after_weight_sharing.ptmodel.
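The idea behind weight sharing can be sketched as follows. The paper clusters each layer's weights with k-means and replaces every weight by its cluster centroid; this example assumes a simple 1-D k-means with linearly spaced initial centroids that skips the zeros left by pruning. The names `share_weights`, `bits`, and `iterations` are hypothetical, and weight_share.py may differ in its details.

```python
def share_weights(weights, bits=2, iterations=20):
    """Quantize non-zero weights to at most 2**bits shared values (1-D k-means)."""
    nonzero = [w for w in weights if w != 0.0]
    k = min(2 ** bits, len(set(nonzero)))
    if k == 0:
        return list(weights), []
    lo, hi = min(nonzero), max(nonzero)
    if k == 1:
        centroids = [sum(nonzero) / len(nonzero)]
    else:
        # Linear initialization: centroids evenly spaced over the weight range.
        centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for w in nonzero:
            nearest = min(range(k), key=lambda i: abs(w - centroids[i]))
            clusters[nearest].append(w)
        # Move each centroid to the mean of its members; empty clusters stay put.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    # Pruned (zero) weights stay zero; the rest snap to the nearest centroid.
    shared = [
        0.0 if w == 0.0 else min(centroids, key=lambda c: abs(w - c))
        for w in weights
    ]
    return shared, centroids


weights = [0.8, 0.0, 0.0, -0.9, 0.0, 0.6, 0.0, 0.75, -0.85, 0.0]
shared, centroids = share_weights(weights, bits=1)
print("centroids:", centroids)
print("shared:   ", shared)
```

After this step, each non-zero weight can be stored as a small cluster index plus one shared table of centroids per layer, which is where the storage saving comes from.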
$ python huffman_encode.py saves/model_after_weight_sharing.ptmodel
This command applies Huffman coding to the weight-shared model and saves the encoded output in the encodings/ folder.
Note that I did not apply pruning, weight sharing, or Huffman coding to the bias values. It might be better to apply them to the biases as well, but I haven't tried this yet.
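As a rough illustration of the Huffman coding step, the sketch below builds a prefix code over a list of symbols using only the standard library. It assumes the symbols are cluster indices produced by weight sharing; that choice, the function name `huffman_code`, and the toy data are assumptions for this example, and huffman_encode.py may encode different arrays.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(symbols):
    """Return a {symbol: bitstring} prefix code built from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    tiebreak = count()  # keeps heap entries comparable when counts are equal
    heap = [(n, next(tiebreak), {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        n0, _, codes0 = heapq.heappop(heap)
        n1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (n0 + n1, next(tiebreak), merged))
    return heap[0][2]


# Toy stream of 2-bit cluster indices with a skewed distribution.
indices = [0] * 12 + [1] * 5 + [2] * 2 + [3]
codes = huffman_code(indices)
encoded = "".join(codes[i] for i in indices)
print(f"{len(encoded)} bits with Huffman coding, "
      f"{2 * len(indices)} bits with fixed-length codes")
```

The gain depends on how skewed the index distribution is: frequent indices get short codes, so a layer dominated by a few clusters compresses best.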
Note that this work was done when I was employed at http://nota.ai