Coding tutorials accompanying my YouTube videos on Neural Network Quantization.
This is the first coding tutorial. We take the torchvision ResNet model and quantize it entirely from scratch with the PyTorch quantization library, using Eager Mode Quantization.
We discuss common issues one can run into, as well as some interesting but tricky bugs.
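As a taster, here is a minimal eager-mode static quantization sketch. It uses a tiny toy model rather than the full ResNet from the video, and the names are illustrative, but the QuantStub / observer / convert flow is the same one the tutorial walks through (assuming the fbgemm backend is available):

```python
import torch
import torch.nn as nn

# Toy stand-in for the tutorial's ResNet: QuantStub/DeQuantStub mark
# where tensors cross the float <-> int8 boundary in eager mode.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)        # float -> quantized
        x = self.relu(self.conv(x))
        return self.dequant(x)   # quantized -> float

model = TinyNet().eval()
model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")

# In eager mode, fusing (here Conv+ReLU) must be requested by name.
torch.ao.quantization.fuse_modules(model, [["conv", "relu"]], inplace=True)

prepared = torch.ao.quantization.prepare(model)
prepared(torch.randn(1, 3, 32, 32))   # calibration pass: observers record ranges
quantized = torch.ao.quantization.convert(prepared)
```

After `convert`, the fused `conv` has been swapped for a quantized ConvReLU2d module, and inference runs with int8 weights and activations.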
TODO
In this tutorial, we do dynamic quantization on a ResNet model. We look at how dynamic quantization works, what the default settings are in PyTorch, and discuss how it differs from static quantization.
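A minimal sketch of the dynamic quantization API, using a toy stack of Linears as a stand-in (with PyTorch's defaults, only Linear and recurrent layers are dynamically quantized, so on a ResNet this would touch just the final fc layer):

```python
import torch
import torch.nn as nn

# Toy stand-in model; in the tutorial this is torchvision's ResNet.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4)).eval()

# Dynamic quantization: weights are converted to int8 ahead of time,
# activations are quantized on the fly at runtime, per batch.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

out = qmodel(torch.randn(2, 8))
```

Because activation ranges are computed at runtime, no calibration pass is needed, which is the main practical difference from static quantization.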
In this tutorial series, we use PyTorch's FX Graph Mode Quantization to quantize a ResNet. In the first video, we look at the Directed Acyclic Graph (DAG) and see how the fusing and the placement of QuantStubs and FloatFunctionals all happen automatically. In the second, we look at some of the intricacies of how quantization interacts with the GraphModule. In the third and final video, we look at some more advanced techniques for manipulating and traversing the graph, and use these to build an alternative to forward hooks and to fuse BatchNorm layers into their preceding Convs.
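The FX workflow can be sketched as below, on a toy Conv-BN-ReLU stack rather than the tutorial's ResNet (illustrative only, and assuming the fbgemm backend). Note there are no QuantStubs in the source model: tracing into a GraphModule lets PyTorch insert observers, fusions, and quant/dequant nodes automatically.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

# Toy model: no QuantStubs or manual fusion needed in FX mode.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU()).eval()

qconfig_mapping = get_default_qconfig_mapping("fbgemm")
example_inputs = (torch.randn(1, 3, 32, 32),)

# prepare_fx symbolically traces the model into a GraphModule,
# fuses Conv+BN+ReLU, and inserts observers at the right graph nodes.
prepared = prepare_fx(model, qconfig_mapping, example_inputs)
prepared(*example_inputs)          # calibration pass
quantized = convert_fx(prepared)   # swap in quantized ops
```

The result is a `torch.fx.GraphModule`, whose graph you can print and traverse, which is what the later videos in the series dig into.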
In this tutorial, we look at how to do Quantization Aware Training (QAT) on an FX Graph Mode quantized ResNet. We build a small training loop with a mini custom data loader. We also generalise the evaluate function we've been using in these tutorials so it works on other images. We go looking for, and find, some of the dangers of overfitting.
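The QAT flow can be sketched as follows, with a toy model and random tensors standing in for the tutorial's ResNet and custom data loader (all names here are illustrative): fake-quantize modules are inserted by `prepare_qat_fx`, the model is trained briefly with them active, then converted to a real int8 model.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_qat_fx, convert_fx

# Toy classifier; QAT must start from a model in train() mode.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Flatten(), nn.Linear(8 * 30 * 30, 10)
).train()

qconfig_mapping = get_default_qat_qconfig_mapping("fbgemm")
example_inputs = (torch.randn(1, 3, 32, 32),)
prepared = prepare_qat_fx(model, qconfig_mapping, example_inputs)

# Mini training loop on random data, standing in for the custom loader.
opt = torch.optim.SGD(prepared.parameters(), lr=0.01)
for _ in range(3):
    x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
    loss = nn.functional.cross_entropy(prepared(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

quantized = convert_fx(prepared.eval())
```

During the loop, gradients flow through the fake-quantize ops, so the weights learn to compensate for rounding error before the final conversion.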
In this tutorial, we look at Cross-Layer Equalization, a classic data-free method for improving the quantization of one's models. We use a graph-tracing method to find all of the layers we can do CLE on, do CLE, evaluate the results, and then visualize what's happening inside the model.
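The core CLE step for a single Conv→ReLU→Conv pair can be sketched as below. This is a simplified, self-contained version (function name and setup are illustrative): it assumes a positively scale-equivariant activation such as ReLU between the layers, and uses the standard per-channel scales s_i = sqrt(r1_i / r2_i) computed from the weight ranges, which leaves the pair's function unchanged while equalizing the two layers' per-channel ranges.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def equalize_pair(conv1: nn.Conv2d, conv2: nn.Conv2d) -> None:
    """Cross-layer equalization for a Conv -> ReLU -> Conv pair (in place)."""
    # Per-channel absolute weight ranges:
    # r1 over conv1's output channels, r2 over conv2's input channels.
    r1 = conv1.weight.abs().amax(dim=(1, 2, 3))   # shape: (out_ch of conv1,)
    r2 = conv2.weight.abs().amax(dim=(0, 2, 3))   # shape: (in_ch of conv2,)
    s = torch.sqrt(r1 / r2)                        # equalizing scale per channel
    # Divide channel i out of conv1, multiply it back into conv2:
    # ReLU(x/s)*s == ReLU(x) for s > 0, so the composed function is unchanged.
    conv1.weight.div_(s.view(-1, 1, 1, 1))
    if conv1.bias is not None:
        conv1.bias.div_(s)
    conv2.weight.mul_(s.view(1, -1, 1, 1))

conv1, conv2 = nn.Conv2d(3, 8, 3), nn.Conv2d(8, 4, 3)
x = torch.randn(1, 3, 16, 16)
before = conv2(torch.relu(conv1(x)))
equalize_pair(conv1, conv2)
after = conv2(torch.relu(conv1(x)))   # numerically the same function
```

After the rescale, both layers' per-channel ranges equal sqrt(r1_i * r2_i), which is what makes per-tensor quantization grids fit each channel better.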