jy-yuan / KIVI

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
https://arxiv.org/abs/2402.02750
MIT License

CUDA version #16

Closed: hensiesp32 closed this issue 2 weeks ago

hensiesp32 commented 2 weeks ago

Thanks for your wonderful work. I have a question about the CUDA version: does KIVI only support CUDA 12.*? My machine has CUDA 11.8, and I ran into trouble setting up the conda environment. I look forward to your reply.

zirui-ray-liu commented 2 weeks ago

Thank you for your interest! We have only tested with CUDA versions above 12.0. Can you copy and paste the error message you get with CUDA 11.8? I will double-check.
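
When reporting a build problem like this, it usually helps to include the CUDA version PyTorch was built against next to the toolkit version that nvcc reports. The following is a minimal sketch, not part of KIVI; the helper name collect_cuda_info is hypothetical, and it assumes nvcc is on PATH if a toolkit is installed.

```python
import shutil
import subprocess


def collect_cuda_info():
    """Gather the CUDA versions seen by PyTorch and by the local toolkit."""
    info = {"torch": None, "torch_cuda": None, "nvcc": None}
    try:
        import torch

        info["torch"] = torch.__version__
        # torch.version.cuda is None for CPU-only builds.
        info["torch_cuda"] = torch.version.cuda
    except (ImportError, OSError):
        pass  # PyTorch is missing or could not be loaded here.

    nvcc = shutil.which("nvcc")
    if nvcc is not None:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        # Keep only the line that names the toolkit release, if present.
        info["nvcc"] = next(
            (line.strip() for line in result.stdout.splitlines() if "release" in line),
            None,
        )
    return info


if __name__ == "__main__":
    for key, value in collect_cuda_info().items():
        print(f"{key}: {value}")
```

If the two versions disagree on the major number (for example 12.x for torch and 11.x for nvcc), that mismatch is a plausible cause of extension build failures and is worth mentioning alongside the error message.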

hensiesp32 commented 2 weeks ago

Sure. When I ran the command "cd quant && pip install -e .", the following error occurred: [image]

My machine's CUDA setup is: [image]

And here is the output of pip list:

accelerate               0.25.0
asttokens                2.4.1
attributedict            0.3.0
blessings                1.7
cachetools               5.3.3
certifi                  2024.2.2
chardet                  5.2.0
charset-normalizer       3.3.2
codecov                  2.1.13
colorama                 0.4.6
coloredlogs              15.0.1
colour-runner            0.1.1
coverage                 7.5.1
decorator                5.1.1
deepdiff                 7.0.1
distlib                  0.3.8
exceptiongroup           1.2.1
executing                2.0.1
fastchat                 0.1.0
filelock                 3.14.0
fsspec                   2024.3.1
huggingface-hub          0.23.0
humanfriendly            10.0
idna                     3.7
inspecta                 0.1.3
ipdb                     0.13.13
ipython                  8.24.0
jedi                     0.19.1
Jinja2                   3.1.4
kivi                     0.1.0       
MarkupSafe               2.1.5
matplotlib-inline        0.1.7
mpmath                   1.3.0
networkx                 3.3
numpy                    1.26.4
nvidia-cublas-cu12       12.1.3.1
nvidia-cuda-cupti-cu12   12.1.105
nvidia-cuda-nvrtc-cu12   12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12        8.9.2.26
nvidia-cufft-cu12        11.0.2.54
nvidia-curand-cu12       10.3.2.106
nvidia-cusolver-cu12     11.4.5.107
nvidia-cusparse-cu12     12.1.0.106
nvidia-nccl-cu12         2.18.1
nvidia-nvjitlink-cu12    12.4.127
nvidia-nvtx-cu12         12.1.105
ordered-set              4.1.0
packaging                24.0
parso                    0.8.4
pexpect                  4.9.0
pillow                   10.2.0
pip                      24.0
platformdirs             4.2.1
pluggy                   1.5.0
prompt-toolkit           3.0.43
protobuf                 5.26.1
psutil                   5.9.8
ptyprocess               0.7.0
pure-eval                0.2.2
Pygments                 2.18.0
pyproject-api            1.6.1
PyYAML                   6.0.1
regex                    2024.5.10
requests                 2.31.0
rootpath                 0.1.1
safetensors              0.4.3
sentencepiece            0.2.0
setuptools               69.5.1
six                      1.16.0
stack-data               0.6.3
sympy                    1.12
termcolor                2.4.0
tokenizers               0.15.2
toml                     0.10.2
tomli                    2.0.1
torch                    2.1.2
torchaudio               2.1.2+cu118
torchvision              0.16.2+cu118
tox                      4.15.0
tqdm                     4.66.4
traitlets                5.14.3
transformers             4.36.2
triton                   2.1.0
typing_extensions        4.11.0
urllib3                  2.2.1
virtualenv               20.26.2
wcwidth                  0.2.13
wheel                    0.43.0

hensiesp32 commented 2 weeks ago

One of my questions is: which versions of the following CUDA-related libraries should I install to match CUDA 11.8?

nvidia-cublas-cu12       12.1.3.1
nvidia-cuda-cupti-cu12   12.1.105
nvidia-cuda-nvrtc-cu12   12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12        8.9.2.26
nvidia-cufft-cu12        11.0.2.54
nvidia-curand-cu12       10.3.2.106
nvidia-cusolver-cu12     11.4.5.107
nvidia-cusparse-cu12     12.1.0.106
nvidia-nccl-cu12         2.18.1
nvidia-nvjitlink-cu12    12.4.127
nvidia-nvtx-cu12         12.1.105

henryzhongsc commented 2 weeks ago

Unfortunately, this is a question we cannot answer with confidence because our server is currently running on CUDA 12+, and it is non-trivial to downgrade our driver to figure out a CUDA 11.8-compatible environment that supports KIVI.

We recommend picking and installing a cu118-compatible torch release, installing the transformers lib as stipulated, then running the code and resolving each dependency error one by one without adjusting torch/cuda version further. Likely none of the cuda-related dependencies you pasted above is compatible with CUDA 11.8, as they all have cu12 in their names. Your best shot is to seek out an alternative cu118-based environment that supports KIVI.
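
To make the mismatch concrete, here is a small sketch that scans pip list output for CUDA tags, either as a name suffix (such as -cu12) or as a local version label (such as +cu118). The function name cuda_tags is hypothetical, and the sample is a subset of the list pasted above. In that list, torchaudio and torchvision are cu118 builds while torch 2.1.2 carries no +cu118 label and sits alongside cu12 wheels, which suggests (though does not prove) that torch itself is a CUDA 12 build.

```python
import re


def cuda_tags(pip_list_text):
    """Map each package to the CUDA tag implied by its name or version."""
    tags = {}
    for line in pip_list_text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        name, version = parts[0], parts[1]
        # Wheels such as nvidia-cublas-cu12 carry the tag in the name.
        match = re.search(r"-cu(\d+)$", name)
        if match is None:
            # Builds such as 2.1.2+cu118 carry it as a local version label.
            match = re.search(r"\+cu(\d+)", version)
        if match is not None:
            tags[name] = "cu" + match.group(1)
    return tags


sample = """\
nvidia-cublas-cu12       12.1.3.1
nvidia-cuda-runtime-cu12 12.1.105
torch                    2.1.2
torchaudio               2.1.2+cu118
torchvision              0.16.2+cu118
"""

tags = cuda_tags(sample)
cu12 = sorted(name for name, tag in tags.items() if tag.startswith("cu12"))
cu11 = sorted(name for name, tag in tags.items() if tag.startswith("cu11"))
print("CUDA 12 packages:", cu12)
print("CUDA 11 packages:", cu11)
```

Seeing both groups non-empty in one environment is a sign that the packages were installed from different sources and should be reinstalled from a single cu118 index.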

Hope this helps!