microsoft / VPTQ

VPTQ, A Flexible and Extreme low-bit quantization algorithm
MIT License

Docker image for development #111

Open caronzh03 opened 1 day ago

caronzh03 commented 1 day ago

Hi, are there plans to create a Docker image for development and testing, especially for the algorithm branch, where one might need to quantize a custom LLM?

YangWang92 commented 22 hours ago

We currently do not have Docker images. If you are looking to develop using a Docker image, you should base it on NVIDIA Docker.

Currently, you can start by creating a conda environment using the algo-environment.yml, which should meet your needs. I may add a Dockerfile in the future, but that will take some time.

caronzh03 commented 6 minutes ago

Thanks @YangWang92 for the pointers. I followed your suggestion and used NVIDIA's Docker image nvcr.io/nvidia/cuda:12.1.0-runtime-ubuntu20.04, but during the CUDA compilation step (TORCH_CUDA_ARCH_LIST=8.0 pip install -e . --no-build-isolation), I got the following error:

      In file included from /root/miniconda3/envs/vptq-algo/lib/python3.10/site-packages/torch/include/ATen/cuda/CUDAContext.h:3,
                       from /models/VPTQ/csrc/common.h:5,
                       from /models/VPTQ/csrc/ops.cc:5:
      /root/miniconda3/envs/vptq-algo/lib/python3.10/site-packages/torch/include/ATen/cuda/CUDAContextLight.h:6:10: fatal error: cuda_runtime_api.h: No such file or directory
          6 | #include <cuda_runtime_api.h>
            |          ^~~~~~~~~~~~~~~~~~~~
      compilation terminated.

After some googling, I'm guessing it's because the compiler cannot find the CUDA installation. I've tried several things, including manually setting CUDA_HOME, but still had no luck getting it to work.
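One way to test that guess is to check whether the header is present in the container at all. The sketch below is only a diagnostic, not part of VPTQ: the function names are made up for illustration, and the candidate locations (the CUDA_HOME and CUDA_PATH environment variables, plus /usr/local/cuda) are assumptions about where a toolkit would normally be installed. As far as I understand, the `runtime` variants of NVIDIA's CUDA images ship only the shared libraries, while the `devel` variants add the headers and nvcc, so a missing header here would point at the image flavor rather than at CUDA_HOME.

```python
import os
from pathlib import Path

HEADER = "cuda_runtime_api.h"


def find_cuda_header(candidates):
    """Return the first existing <root>/include/cuda_runtime_api.h, or None."""
    for root in candidates:
        if not root:  # skip unset environment variables
            continue
        header = Path(root) / "include" / HEADER
        if header.is_file():
            return header
    return None


def default_candidates():
    # Assumed search locations; adjust them to match your container.
    return [
        os.environ.get("CUDA_HOME"),
        os.environ.get("CUDA_PATH"),
        "/usr/local/cuda",
    ]


if __name__ == "__main__":
    found = find_cuda_header(default_candidates())
    if found:
        print(f"Found {found}")
    else:
        print(f"{HEADER} not found; the image may lack the CUDA development headers.")
```

If the script reports that the header is missing, pointing CUDA_HOME at the same directory will not help, and switching to a `devel` variant of the image would be the next thing to try.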

Did you develop on an Ubuntu box?