AutoAWQ Kernels is a new package, split out of the main AutoAWQ repository so that installing the main package does not require compiling the kernels.
Windows: Must use WSL2.
NVIDIA:
AMD:
The package is available on PyPI with CUDA 12.4.1 wheels:
pip install autoawq-kernels
To build the kernels from source, you first need to set up an environment containing the necessary dependencies.
rocsparse-dev hipsparse-dev rocthrust-dev rocblas-dev hipblas-dev
pip install git+https://github.com/casper-hansen/AutoAWQ_kernels.git
Notes on environment variables:
TORCH_VERSION: By default, the build targets the installed version of torch, as reported by torch.__version__. You can override it with TORCH_VERSION.
CUDA_VERSION or ROCM_VERSION: These can also be used to build for a specific version of CUDA or ROCm.
CC and CXX: You can specify which compilers to use for the C and C++ code, e.g. CC=g++-13 CXX=g++-13 pip install -e .
COMPUTE_CAPABILITIES: You can restrict the build to specific compute capabilities: COMPUTE_CAPABILITIES="75,80,86,87,89,90" pip install -e .
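The variables above can be combined on one command line. The sketch below only prints the command as a dry run instead of running the build. The helper to_compute_capabilities is hypothetical and not part of AutoAWQ_kernels; it assumes you start from NVIDIA's dotted notation (such as 8.6) and want the comma-separated form shown above. The capability values and the g++-13 compiler are example choices, not recommendations.

```shell
# Hypothetical helper (not part of AutoAWQ_kernels): convert dotted compute
# capabilities such as 8.0 into the comma-separated form used above (80,86,...).
to_compute_capabilities() {
  printf '%s\n' "$@" | tr -d '.' | paste -sd, -
}

# Example values only; choose the capabilities that match your GPUs.
caps="$(to_compute_capabilities 8.0 8.6 9.0)"

# Dry run: print the build command instead of executing it.
echo "COMPUTE_CAPABILITIES=\"$caps\" CC=g++-13 CXX=g++-13 pip install -e ."
```

Once the printed command looks right, run it from the root of a local checkout of the repository. Compiling for fewer compute capabilities generally shortens the build, because each listed capability adds another target to compile for.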