PyTorch implementation of the paper Scalable Model Compression by Entropy Penalized Reparameterization.
To set up the environment to run the code locally, you can run
conda env create -f environment.yml
Otherwise, if you want to import the code as an external package, you need to run
pip install git+https://github.com/Dan8991/SMCEPR_pytorch
or, if you want a specific version, run
pip install git+https://github.com/Dan8991/SMCEPR_pytorch@{version}
where version is the name of one of the version tags, e.g. v0.1.0.
If you are in a conda environment, you also need to install pip beforehand with
conda install pip
To test the code you can run
python main.py
This will train the selected model on the MNIST dataset. The currently supported models are the fully connected LeNet model and the convolutional CafeLeNet model; both come in a standard and an entropy-penalized variant (LeNet/EntropyLeNet and CafeLeNet/EntropyCafeLeNet), so you can compare the performance of the two versions. To change the training tradeoff you can change the lambda_RD parameter: the higher the lambda, the lower the final rate of the model.
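As a rough illustration of the role of lambda_RD (a plain-Python sketch, not the package's actual code; total_loss, task_loss and rate are hypothetical names), the entropy-penalized models are trained with a rate-distortion objective where lambda_RD weights the rate term:

```python
# Hypothetical sketch of a rate-distortion objective: the task loss
# (distortion) plus lambda_RD times the rate returned by the entropy
# layers. A higher lambda_RD penalizes the rate more, pushing the
# optimizer toward a smaller (lower-rate) model.

def total_loss(task_loss, rate, lambda_RD):
    return task_loss + lambda_RD * rate

# Same rate, two tradeoffs: the larger lambda_RD gives the larger penalty.
mild = total_loss(task_loss=0.5, rate=2000.0, lambda_RD=1e-4)    # 0.5 + 0.2
strong = total_loss(task_loss=0.5, rate=2000.0, lambda_RD=1e-3)  # 0.5 + 2.0
```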
First of all, it is important to introduce the parameter decoder classes, which transform the parameters from their quantized representation into a proper representation for linear and convolutional layers. Two classes are used for this purpose, i.e. AffineDecoder and ConvDecoder. The former can be used to decode the weights of linear layers and the biases, while the latter decodes the weights of convolutional layers. To import them use:
from smcper.parameter_decoders import AffineDecoder, ConvDecoder
The main parameters of these classes are as follows:
AffineDecoder(l)
where
ConvDecoder(kernel_size)

where kernel_size is the kernel size of the convolutional layers whose weights are being decoded.
There are two main entropy layers that can be used, i.e. the EntropyLinear and EntropyConv2d classes (the latter does not have all functionalities implemented yet; for example, the representation in the frequency domain can't be used). To import them use:
from smcper.entropy_layers import EntropyLinear, EntropyConv2d
The main parameters of these classes are as follows:
EntropyLinear(
    in_features,
    out_features,
    weight_decoder,
    bias_decoder=None,
    ema_decay=0.999
)

where in_features and out_features have the same meaning as in torch.nn.Linear, weight_decoder is the decoder (e.g. an AffineDecoder) used to reconstruct the weights from their quantized representation, bias_decoder plays the same role for the bias, and ema_decay is the decay rate of the exponential moving average used when updating the probability tables.
EntropyConv2d(
    kernel_size,
    in_features,
    out_features,
    weight_decoder,
    padding=0,
    stride=1,
    bias_decoder=None,
    ema_decay=0.999
)

where kernel_size, padding and stride have the same meaning as in torch.nn.Conv2d, in_features and out_features are the numbers of input and output channels, weight_decoder is the decoder (e.g. a ConvDecoder) used to reconstruct the kernels from their quantized representation, bias_decoder plays the same role for the bias, and ema_decay is the decay rate of the exponential moving average used when updating the probability tables.
In general, the forward function of the entropy layers returns both the output of the layer and the rate, which can be added to the loss after being weighted by a lambda parameter. Some useful functions that can be called on these layers are:
def update(self, force=False)
This function is required by compressai, which is used to implement the compression part; it updates the probability tables and should be run before evaluation.
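As a hedged illustration of what "updating the probability tables" means (plain Python with a hypothetical build_cdf helper, not compressai's actual code): the learned probability of each quantized symbol is frozen into a cumulative table that an entropy coder can then use consistently at compression and decompression time:

```python
# Hypothetical sketch: an entropy coder needs a cumulative distribution
# over the quantized symbols; updating the tables refreshes it from the
# learned probability model so encoder and decoder stay in sync.

def build_cdf(probs):
    cdf = [0.0]
    for p in probs:
        cdf.append(cdf[-1] + p)
    return cdf

cdf = build_cdf([0.5, 0.25, 0.25])  # [0.0, 0.5, 0.75, 1.0]
```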
def get_compressed_params_size(self)
Returns the total exact compressed size of the layer; this can be used to understand how much the network will weigh in total when transmitted.
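As a back-of-the-envelope sketch of where such a size comes from (plain Python; shannon_bits is a hypothetical helper, not the package's API), the compressed size of a layer is closely tied to the Shannon information of its quantized parameters under the learned probability model:

```python
import math

# Hypothetical sketch: an ideal entropy coder spends about -log2(p)
# bits per symbol, so summing this over all quantized parameters of a
# layer estimates its compressed size in bits; summing over layers
# estimates the transmitted size of the whole network.

def shannon_bits(symbols, probs):
    return sum(-math.log2(probs[s]) for s in symbols)

bits = shannon_bits([0, 0, 1], probs={0: 0.5, 1: 0.25})  # 1 + 1 + 2 = 4 bits
```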