HAWQ is an advanced quantization library written for PyTorch. HAWQ enables low-precision and mixed-precision uniform quantization, with direct hardware implementation through TVM.
For more details please see:
git clone https://github.com/Zhen-Dong/HAWQ.git
cd HAWQ
pip install -r requirements.txt
The following example runs uniform 8-bit quantization for ResNet50 on ImageNet:
export CUDA_VISIBLE_DEVICES=0
python quant_train.py -a resnet50 --epochs 1 --lr 0.0001 --batch-size 128 --data /path/to/imagenet/ --pretrained --save-path /path/to/checkpoints/ --act-range-momentum=0.99 --wd 1e-4 --data-percentage 0.0001 --fix-BN --checkpoint-iter -1 --quant-scheme uniform8
The commands for other quantization schemes and for other networks are shown in the model zoo.
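For intuition about what a uniform scheme such as `uniform8` means, here is a minimal sketch of symmetric uniform quantization in plain Python. It is not HAWQ's implementation: the function names `quantize_symmetric` and `dequantize` are hypothetical, and the single per-tensor scale taken from the largest absolute value is an assumption made for illustration.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers on a uniform grid.

    Illustrative only: one scale per tensor, derived from the largest
    absolute value (an assumption, not necessarily what HAWQ does).
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest grid point, then clamp to the signed integer range.
    ints = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return ints, scale


def dequantize(ints, scale):
    """Recover approximate float values from integers and their scale."""
    return [q * scale for q in ints]


weights = [0.50, -1.27, 0.03, 1.00]

q8, s8 = quantize_symmetric(weights, num_bits=8)
q4, s4 = quantize_symmetric(weights, num_bits=4)

print("8-bit integers:", q8, "scale:", s8)
print("4-bit integers:", q4, "scale:", s4)
print("4-bit reconstruction:", dequantize(q4, s4))
```

Lowering `num_bits` coarsens the grid, so the reconstruction error grows while each value needs fewer bits. Mixed-precision quantization chooses the bit width per layer rather than using one width everywhere.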
Table I and Table II in HAWQ-V3: Dyadic Neural Network Quantization
Model | Quantization | Model Size (MB) | BOPS (G) | Accuracy (%) | Inference Speed (batch=8, ms) | Download |
---|---|---|---|---|---|---|
ResNet18 | Floating Points | 44.6 | 1858 | 71.47 | 9.7 (1.0x) | resnet18_baseline |
ResNet18 | W8A8 | 11.1 | 116 | 71.56 | 3.3 (3.0x) | resnet18_uniform8 |
ResNet18 | Mixed Precision | 6.7 | 72 | 70.22 | 2.7 (3.6x) | resnet18_bops0.5 |
ResNet18 | W4A4 | 5.8 | 34 | 68.45 | 2.2 (4.4x) | resnet18_uniform4 |

Model | Quantization | Model Size (MB) | BOPS (G) | Accuracy (%) | Inference Speed (batch=8, ms) | Download |
---|---|---|---|---|---|---|
ResNet50 | Floating Points | 97.8 | 3951 | 77.72 | 26.2 (1.0x) | resnet50_baseline |
ResNet50 | W8A8 | 24.5 | 247 | 77.58 | 8.5 (3.1x) | resnet50_uniform8 |
ResNet50 | Mixed Precision | 18.7 | 154 | 75.39 | 6.9 (3.8x) | resnet50_bops0.5 |
ResNet50 | W4A4 | 13.1 | 67 | 74.24 | 5.8 (4.5x) | resnet50_uniform4 |

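
The model sizes in the tables above can be roughly cross-checked with simple arithmetic. The sketch below assumes the floating-point baseline stores 32 bits per weight and that a uniform scheme stores every weight at its nominal bit width. The helper name `estimate_size_mb` is hypothetical, and the reported values are copied from the tables.

```python
def estimate_size_mb(fp32_size_mb, weight_bits):
    """Scale an FP32 model size to a lower uniform weight bit width.

    Assumes 32 bits per weight in the baseline and ignores any tensors
    that might be kept at a different precision.
    """
    return fp32_size_mb * weight_bits / 32


# FP32 baseline sizes and reported quantized sizes (MB), from the tables.
fp32_sizes_mb = {"ResNet18": 44.6, "ResNet50": 97.8}
reported_mb = {
    ("ResNet18", 8): 11.1,
    ("ResNet18", 4): 5.8,
    ("ResNet50", 8): 24.5,
    ("ResNet50", 4): 13.1,
}

for (model, bits), reported in reported_mb.items():
    estimate = estimate_size_mb(fp32_sizes_mb[model], bits)
    print(f"{model} W{bits}: estimated {estimate:.2f} MB, reported {reported} MB")
```

Under these assumptions the 8-bit estimates land within about 0.1 MB of the reported sizes. The 4-bit estimates come out somewhat below the reported sizes, which would be consistent with some tensors being stored above 4 bits, although the tables alone do not confirm this.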
More results for other quantization schemes and models, along with the corresponding commands and important notes, are available in the model zoo. \
To download the quantized models with wget, see the example command in the model zoo. \
Checkpoints in the model zoo are saved in floating-point precision. To reduce their size, BitPack can be applied to the weight_integer tensors, or directly to the quantized_checkpoint.pth.tar file.
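To illustrate why packing helps when low-bit integers are stored in a wider container, here is a minimal sketch that packs signed 4-bit values two per byte. It is not BitPack's API or file format: the functions `pack_int4` and `unpack_int4` and the nibble layout are hypothetical and show only the general idea.

```python
def pack_int4(values):
    """Pack signed 4-bit integers (range -8..7) two per byte."""
    if any(not -8 <= v <= 7 for v in values):
        raise ValueError("values must fit in signed 4 bits")
    nibbles = [v & 0xF for v in values]       # two's-complement low 4 bits
    if len(nibbles) % 2:
        nibbles.append(0)                     # pad to a whole number of bytes
    # Even positions go in the low nibble, odd positions in the high nibble.
    return bytes((hi << 4) | lo for lo, hi in zip(nibbles[0::2], nibbles[1::2]))


def unpack_int4(packed, count):
    """Invert pack_int4, returning the first `count` signed values."""
    out = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            out.append(nibble - 16 if nibble >= 8 else nibble)
    return out[:count]


weight_integer = [3, -7, 0, 6, -8]            # example 4-bit weights
packed = pack_int4(weight_integer)

print("packed bytes:", len(packed))
print("as 32-bit values:", 4 * len(weight_integer), "bytes")
print("round trip ok:", unpack_int4(packed, len(weight_integer)) == weight_integer)
```

The element count is passed to `unpack_int4` because an odd-length input is padded with one extra nibble that must be dropped on the way back.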
THIS SOFTWARE WAS DEPOSITED IN THE BAIR OPEN RESEARCH COMMONS REPOSITORY ON FEB 1, 2023.
HAWQ is released under the MIT license.