This repository contains supplementary code for the paper Automated Multi-Stage Compression of Neural Networks. It demonstrates how a neural network with convolutional and fully connected layers can be compressed using iterative tensor decomposition of weight tensors.
Requirements:

```
numpy
scipy
scikit-tensor-py3
absl-py
flopco-pytorch
tensorly==0.4.5
pytorch
```

Installation:

```
pip install musco-pytorch
```
```python
from torchvision.models import resnet50
from flopco import FlopCo
from musco.pytorch import CompressorVBMF, CompressorPR, CompressorManual

device = 'cuda'

model = resnet50(pretrained=True).to(device)
model_stats = FlopCo(model, device=device)

compressor = CompressorVBMF(model,
                            model_stats,
                            ft_every=5,
                            nglobal_compress_iters=2)

while not compressor.done:
    # Compress the next group of layers.
    compressor.compression_step()

    # Fine-tune the compressed model here to restore accuracy.

compressed_model = compressor.compressed_model

# The compressor decomposes 5 layers on each compression iteration (ft_every=5).
# The compressed model is available at compressor.compressed_model.
# Fine-tune the model after each compression iteration to restore accuracy.
```
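The quick start above leaves the fine-tuning step as a comment. Below is a minimal sketch of how that step could look, assuming a standard supervised PyTorch training loop; the `finetune` helper, `train_loader`, and the hyperparameters are placeholders (not part of the MUSCO API) that you should replace with your own data pipeline and training settings. `compressor` and `device` are taken from the quick start above.

```python
import torch
import torch.nn as nn

def finetune(model, train_loader, device, epochs=1, lr=1e-4):
    # Plain supervised fine-tuning loop run between compression steps.
    model.to(device).train()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model

while not compressor.done:
    # Compress the next ft_every layers.
    compressor.compression_step()
    # Restore accuracy before the next compression step.
    finetune(compressor.compressed_model, train_loader, device)

compressed_model = compressor.compressed_model
```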
Please find more examples in the musco/pytorch/examples folder.
You can compress the model using different strategies, depending on the rank selection method.
Using any of the compressors listed below, you can optionally specify:

- `ranks = {lname : None for lname in noncompressing_lnames}` (layers mapped to None are left uncompressed)
- `ft_every = 3` (the compression schedule is then: compress 3 layers, fine-tune, compress another 3 layers, fine-tune, ...)
- `nglobal_iters = 2` (number of global compression iterations, 1 by default)

CompressorVBMF: ranks are determined by a global analytic solution of variational Bayesian matrix factorization (EVBMF). Optionally, you can specify `vbmf_weakenen_factors = {lname : factor for lname in lnames}`.

CompressorPR: ranks correspond to a chosen fixed parameter reduction rate (specified per layer via the `param_reduction_rates` argument, default: 2x for all layers), which can differ between layers. Optionally, you can specify `conv2d_nn_decomposition = cp3`.

CompressorManual: manually specified ranks are used, `ranks = {lname : rank for lname in lnames}` (if you don't want to compress a layer, set None instead of the rank value). Optionally, you can specify `conv2d_nn_decomposition = tucker2`. A usage sketch combining these options follows below.
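For orientation, here is a sketch of how the options above might be passed to the three compressors. It reuses `model` and `model_stats` from the quick start; the layer names, factor, rate, and rank values are hypothetical, and the exact keyword spellings (e.g. the string values for `conv2d_nn_decomposition`) follow the option names listed above rather than a verified API reference.

```python
# Hypothetical layer names and values, shown only to illustrate the options above.

# EVBMF-based rank selection, with an optional per-layer weakening factor.
compressor = CompressorVBMF(model,
                            model_stats,
                            vbmf_weakenen_factors={'layer1.0.conv1': 0.8},
                            ft_every=3,
                            nglobal_compress_iters=2)

# Fixed per-layer parameter reduction rates (default: 2x), CP3 decomposition for conv layers.
compressor = CompressorPR(model,
                          model_stats,
                          param_reduction_rates={'layer1.0.conv1': 4},
                          conv2d_nn_decomposition='cp3',
                          ft_every=3)

# Manually chosen ranks; None means the layer is left uncompressed.
compressor = CompressorManual(model,
                              model_stats,
                              ranks={'layer1.0.conv1': 32, 'layer1.0.conv2': None},
                              conv2d_nn_decomposition='tucker2',
                              ft_every=3)
```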
If you used our research, we kindly ask you to cite the corresponding paper.
```
@inproceedings{gusak2019automated,
  title={Automated Multi-Stage Compression of Neural Networks},
  author={Gusak, Julia and Kholiavchenko, Maksym and Ponomarev, Evgeny and Markeeva, Larisa and Blagoveschensky, Philip and Cichocki, Andrzej and Oseledets, Ivan},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision Workshops},
  pages={0--0},
  year={2019}
}
```
The project is distributed under the Apache License 2.0.