Brevitas is a PyTorch library for neural network quantization, with support for both post-training quantization (PTQ) and quantization-aware training (QAT).
Please note that Brevitas is a research project and not an official Xilinx product.
If you like this project, please consider starring (⭐) this repo, as it is the simplest and best way to support it.
You can install the latest release from PyPI:
pip install brevitas
Brevitas currently offers quantized implementations of the most common PyTorch layers used in DNNs under brevitas.nn, such as QuantConv1d, QuantConv2d, QuantConvTranspose1d, QuantConvTranspose2d, QuantMultiheadAttention, QuantRNN, QuantLSTM, etc., for adoption within PTQ and/or QAT.
For each of these layers, the quantization of different tensors (inputs, weights, bias, outputs, etc.) can be individually tuned across a wide range of quantization settings.
As a reference for PTQ, Brevitas provides an example user flow for ImageNet classification models under brevitas_examples.imagenet_classification.ptq that quantizes an input torchvision model using PTQ under different quantization configurations (e.g. bit-width, granularity of scale, etc.).
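To make "bit-width" and "granularity of scale" concrete, here is a dependency-free sketch of symmetric integer quantization with either one scale for the whole tensor or one scale per output channel. This is not Brevitas's implementation: the helper names (max_abs_scale, fake_quantize, max_error) and the toy weight values are hypothetical, chosen only to show why a finer scale granularity can help when channels have very different ranges.

```python
def max_abs_scale(values, bit_width):
    """Scale that maps the largest magnitude onto the largest signed integer."""
    q_max = 2 ** (bit_width - 1) - 1
    return max(abs(v) for v in values) / q_max


def fake_quantize(values, scale, bit_width):
    """Round to the signed integer grid, clamp, then map back to floats."""
    q_min = -(2 ** (bit_width - 1))
    q_max = 2 ** (bit_width - 1) - 1
    return [max(q_min, min(q_max, round(v / scale))) * scale for v in values]


def max_error(original, quantized):
    """Largest absolute difference between two equally sized lists."""
    return max(abs(a - b) for a, b in zip(original, quantized))


# Toy weights (hypothetical): two output channels with very different ranges.
weights = [
    [0.02, -0.01, 0.03],
    [1.5, -2.0, 0.7],
]
bit_width = 4

# Per-tensor granularity: a single scale shared by every channel.
shared_scale = max_abs_scale([v for row in weights for v in row], bit_width)
per_tensor = [fake_quantize(row, shared_scale, bit_width) for row in weights]

# Per-channel granularity: each output channel gets its own scale.
per_channel = [
    fake_quantize(row, max_abs_scale(row, bit_width), bit_width)
    for row in weights
]

err_tensor = max_error(weights[0], per_tensor[0])
err_channel = max_error(weights[0], per_channel[0])
print("small channel, per-tensor error: ", err_tensor)
print("small channel, per-channel error:", err_channel)
```

With a shared scale, the small-range channel is dominated by the large-range one and collapses to zero at 4 bits, whereas a per-channel scale preserves it. This kind of trade-off is what comparing different quantization configurations is meant to expose.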
For more info, check out our getting started guide.
If you adopt Brevitas in your work, please cite it as:
@software{brevitas,
author = {Alessandro Pappalardo},
title = {Xilinx/brevitas},
year = {2023},
publisher = {Zenodo},
doi = {10.5281/zenodo.3333552},
url = {https://doi.org/10.5281/zenodo.3333552}
}