
SMCEPR_pytorch

Implementation of the paper Scalable Model Compression by Entropy Penalized Reparametrization in PyTorch.

Installation

To set up the environment to run the code locally, run

conda env create -f environment.yml

Otherwise, if you want to import the code as an external package, run

pip install git+https://github.com/Dan8991/SMCEPR_pytorch

Or if you want a specific version run

pip install git+https://github.com/Dan8991/SMCEPR_pytorch@{version}

where `version` is the name of one of the version tags, e.g. `v0.1.0`.

If you are in a conda environment you also need to install pip beforehand with

conda install pip

Running code

To test the code you can run

python main.py

This trains the selected model on the MNIST dataset. The currently supported models are the fully connected LeNet and the convolutional CafeLeNet. Both LeNet and EntropyLeNet, as well as CafeLeNet and EntropyCafeLeNet, are available, so you can compare the performance of the entropy version of each model against the normal one. To change the training tradeoff you can tune the lambda_RD parameter: the higher the lambda, the lower the final rate of the model.
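The role of lambda_RD can be illustrated with a minimal sketch of a rate–distortion objective (the function name `rd_loss` and the numbers below are illustrative, not part of the repository's API):

```python
def rd_loss(task_loss, rate, lambda_rd):
    """Rate-distortion objective: a higher lambda_rd penalizes the rate
    more strongly, pushing training toward a smaller (lower-rate) model."""
    return task_loss + lambda_rd * rate

# Same task loss and rate, two tradeoff settings.
low = rd_loss(task_loss=0.5, rate=1000.0, lambda_rd=1e-4)   # ≈ 0.6
high = rd_loss(task_loss=0.5, rate=1000.0, lambda_rd=1e-2)  # ≈ 10.5
# With the larger lambda the rate term dominates, so the optimizer
# trades task accuracy for a lower bitrate.
```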

Main functions

First of all it is important to introduce the parameter decoder classes, which transform the parameters from their quantized representation into a representation suitable for linear and convolutional layers. Two classes are used in this case, i.e. the AffineDecoder and the ConvDecoder. The former can be used to decode weights from linear layers and biases, while the latter decodes weights used in convolutional layers. To import them use:

from smcper.parameter_decoders import AffineDecoder, ConvDecoder

The main parameters of these classes are as follows

AffineDecoder(l)

where

ConvDecoder(kernel_size)

where
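Conceptually, a parameter decoder maps quantized integer values back to real-valued weights. A minimal affine sketch (illustrative only; the actual AffineDecoder learns its transform jointly with the network rather than using fixed constants) looks like:

```python
def affine_decode(quantized, scale, shift):
    """Map quantized integer parameters q to real weights via w = q * scale + shift.
    This is a fixed-parameter stand-in for the learned affine decoding step."""
    return [q * scale + shift for q in quantized]

weights = affine_decode([-2, 0, 3], scale=0.1, shift=0.05)
# ≈ [-0.15, 0.05, 0.35]
```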

There are two main entropy layers, i.e. the EntropyLinear and the EntropyConv2d classes (the latter does not yet implement all functionality; for example, the representation in the frequency domain can't be used).

from smcper.entropy_layers import EntropyLinear, EntropyConv2d

The main parameters of these classes are as follows

EntropyLinear(
    in_features,
    out_features,
    weight_decoder,
    bias_decoder=None,
    ema_decay=0.999
)

where

EntropyConv2d(
    kernel_size,
    in_features,
    out_features,
    weight_decoder,
    padding=0,
    stride=1,
    bias_decoder=None,
    ema_decay=0.999
)

where

In general, the forward function of the entropy layers returns both the output of the layer and the rate; the rate can be added to the training objective after weighting it with a lambda parameter. Some useful functions that can be called on these layers are:
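A hedged sketch of how these return values would be combined into a training loss (the layer below is an illustrative stand-in, not the real EntropyLinear, and the rate values are made up):

```python
class StubEntropyLayer:
    """Stand-in for an entropy layer: its forward returns (output, rate)."""
    def __init__(self, rate):
        self.rate = rate

    def __call__(self, x):
        # A real entropy layer would also transform x.
        return x, self.rate

layers = [StubEntropyLayer(120.0), StubEntropyLayer(80.0)]
lambda_rd = 1e-3

x, total_rate = "input", 0.0
for layer in layers:
    x, rate = layer(x)
    total_rate += rate  # accumulate the rate of every entropy layer

task_loss = 0.42  # e.g. cross-entropy on the batch (made-up value)
loss = task_loss + lambda_rd * total_rate  # ≈ 0.62
```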

def update(self, force=False)

This is required by compressai, which is used to implement the compression part; it updates the probability tables and should be run before evaluation.

def get_compressed_params_size(self)

Returns the total exact compressed size of the layer. This can be used to understand how much the network will weigh in total when transmitted.
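To estimate the total transmitted size of a model, you could sum this quantity over its entropy layers. A sketch with stub layers (the method name matches the API above; the stub class and byte counts are illustrative):

```python
class StubLayer:
    """Stand-in exposing get_compressed_params_size like an entropy layer."""
    def __init__(self, nbytes):
        self._nbytes = nbytes

    def get_compressed_params_size(self):
        # A real layer would return the exact entropy-coded size.
        return self._nbytes

model_layers = [StubLayer(1536), StubLayer(4096), StubLayer(512)]
total_bytes = sum(l.get_compressed_params_size() for l in model_layers)
print(f"compressed model size: {total_bytes / 1024:.1f} KiB")  # 6.0 KiB
```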