turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Is there a Colab or something which shows all the code necessary to set up the project? #175

Closed: slooi closed this issue 10 months ago

slooi commented 10 months ago

As per the title. I've never set up something like this before on my Ubuntu server. Am I doing something wrong?

git clone https://github.com/turboderp/exllamav2
cd exllamav2
sed -i "s/torch>=2.1.0/torch==2.0.0/g" requirements.txt
pip install -r requirements.txt
git clone https://huggingface.co/LoneStriker/Thespis-Mistral-7b-v0.5-3.0bpw-h6-exl2
python examples/chat.py -m "LoneStriker/Thespis-Mistral-7b-v0.5-3.0bpw-h6-exl2" -mode llama

The commands above produce the following error:


Unpacking objects: 100% (14/14), 4.23 KiB | 1.06 MiB/s, done.
Traceback (most recent call last):
  File "/kaggle/working/exllamav2/exllamav2/ext.py", line 14, in <module>
    import exllamav2_ext
ModuleNotFoundError: No module named 'exllamav2_ext'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/kaggle/working/exllamav2/examples/chat.py", line 5, in <module>
    from exllamav2 import(
  File "/kaggle/working/exllamav2/exllamav2/__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "/kaggle/working/exllamav2/exllamav2/model.py", line 17, in <module>
    from exllamav2.cache import ExLlamaV2CacheBase
  File "/kaggle/working/exllamav2/exllamav2/cache.py", line 2, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "/kaggle/working/exllamav2/exllamav2/ext.py", line 126, in <module>
    exllamav2_ext = load \
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1601, in _write_ninja_file_and_build_library
    extra_ldflags = _prepare_ldflags(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1699, in _prepare_ldflags
    extra_ldflags.append(f'-L{_join_cuda_home("lib64")}')
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2223, in _join_cuda_home
    raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
turboderp commented 10 months ago

I added a Colab notebook to the examples here. Maybe that helps.

Locally, you'll want to make sure you have the CUDA toolkit installed, along with the build of PyTorch that matches your CUDA version.
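
For anyone landing here with the same error: as the traceback above shows, exllamav2 builds its C++/CUDA extension (exllamav2_ext) on first import via torch.utils.cpp_extension.load, and that JIT build needs to locate the CUDA toolkit through CUDA_HOME. A minimal sketch of the environment setup, assuming the toolkit is installed under /usr/local/cuda (adjust the path to your actual install root):

# Assumption: the CUDA toolkit lives under /usr/local/cuda; point this at your install root.
export CUDA_HOME=/usr/local/cuda
export PATH="$CUDA_HOME/bin:$PATH"

# Sanity checks: nvcc should be found, and PyTorch should report a CUDA-enabled build.
nvcc --version
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"

Note also that the sed line in the original report pins torch==2.0.0 even though requirements.txt asks for torch>=2.1.0; that downgrade may cause its own problems, so installing a CUDA-enabled PyTorch that meets the stated requirement is the safer choice.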