turboderp / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.

OSError: CUDA_HOME environment variable is not set. #291


jamesbraza commented 9 months ago

I am getting this on my Mac M1 (Ventura 13.5.2) with Python 3.11.5:

Traceback (most recent call last):
  File "/Users/user/code/project/text-generation-webui/server.py", line 29, in <module>
    from modules import (
  File "/Users/user/code/project/text-generation-webui/modules/ui_default.py", line 3, in <module>
    from modules import logits, shared, ui, utils
  File "/Users/user/code/project/text-generation-webui/modules/logits.py", line 4, in <module>
    from modules.exllama import ExllamaModel
  File "/Users/user/code/project/text-generation-webui/modules/exllama.py", line 22, in <module>
    from generator import ExLlamaGenerator
  File "/Users/user/code/project/text-generation-webui/repositories/exllama/generator.py", line 1, in <module>
    import cuda_ext
  File "/Users/user/code/project/text-generation-webui/repositories/exllama/cuda_ext.py", line 43, in <module>
    exllama_ext = load(
                  ^^^^^
  File "/Users/user/code/project/text-generation-webui-venv/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/Users/user/code/project/text-generation-webui-venv/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/Users/user/code/project/text-generation-webui-venv/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1601, in _write_ninja_file_and_build_library
    extra_ldflags = _prepare_ldflags(
                    ^^^^^^^^^^^^^^^^^
  File "/Users/user/code/project/text-generation-webui-venv/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1699, in _prepare_ldflags
    extra_ldflags.append(f'-L{_join_cuda_home("lib64")}')
                              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/code/project/text-generation-webui-venv/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2223, in _join_cuda_home
    raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
make: *** [run-llama2-text-generation-webui] Error 1

My Mac doesn't have a GPU, so I don't have CUDA. How can I get past this error?
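
For reference, a minimal diagnostic (plain PyTorch, nothing exllama-specific) confirms what the traceback implies, namely that this install has neither a CUDA device nor a CUDA toolkit for the JIT compiler to use:

import torch
from torch.utils.cpp_extension import CUDA_HOME

# Both report the absence of CUDA on Apple Silicon: no device, no toolkit root.
print(torch.cuda.is_available())  # False on an M1 Mac
print(CUDA_HOME)                  # None, which is what raises the OSError above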

Ph0rk0z commented 9 months ago

I don't think it runs on Metal. Only AMD/Nvidia so far.

turboderp commented 9 months ago

I don't know the situation around running CUDA on Macs, if that's even possible, but yes, if you're trying to run it on Metal you definitely won't get very far. As far as I know llama.cpp is the only option for that right now.
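
For anyone landing here looking for the Mac path, here is a minimal sketch of the llama.cpp route via the llama-cpp-python bindings (pip install llama-cpp-python), which do support Metal. The model path is a placeholder, and a Metal-enabled build is assumed:

from llama_cpp import Llama

# n_gpu_layers=-1 asks the backend to offload all layers to the GPU (Metal on Apple Silicon).
llm = Llama(model_path="./models/llama-7b.Q4_K_M.gguf", n_gpu_layers=-1)
out = llm("Q: What runs Llama on a Mac? A:", max_tokens=16)
print(out["choices"][0]["text"])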

jamesbraza commented 9 months ago

Okay, thanks for answering; I follow. Consider adding to the README that macOS without CUDA is currently unsupported; otherwise, feel free to close this out.

MrOiseau commented 9 months ago

I'm getting these errors related to exllama when trying to load a model in Text generation web UI on a MacBook M1:

2023-09-20 00:55:59 WARNING:exllama module failed to import. Will attempt to import from repositories/.
2023-09-20 00:55:59 ERROR:Could not find repositories/exllama. Please ensure that exllama (https://github.com/turboderp/exllama) is cloned inside repositories/ and is up to date.
2023-09-20 00:55:59 ERROR:Failed to load the model.
Traceback (most recent call last):
  File "/Users/.../text-generation-webui/modules/exllama.py", line 13, in <module>
    from exllama.generator import ExLlamaGenerator
ModuleNotFoundError: No module named 'exllama'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/.../text-generation-webui/modules/ui_model_menu.py", line 194, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
  File "/Users/.../text-generation-webui/modules/models.py", line 77, in load_model
    output = load_func_map[loader](model_name)
  File "/Users/.../text-generation-webui/modules/models.py", line 307, in ExLlama_loader
    from modules.exllama import ExllamaModel
  File "/Users/.../text-generation-webui/modules/exllama.py", line 22, in <module>
    from generator import ExLlamaGenerator
ModuleNotFoundError: No module named 'generator'

Anyone know how to solve this one?

jamesbraza commented 9 months ago

@MrOiseau look in that stack trace near the top:

2023-09-20 00:55:59 ERROR:Could not find repositories/exllama. Please ensure that exllama (https://github.com/turboderp/exllama) is cloned inside repositories/ and is up to date.

Then look here: https://github.com/oobabooga/text-generation-webui#amd-metal-intel-arc-and-cpus-without-avx2

cd text-generation-webui
git clone https://github.com/turboderp/exllama repositories/exllama

In other words, I think you need to clone this repo to text-generation-webui/repositories/

Also, technically, your issue is outside the scope of this issue and this repo.
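
As a quick sanity check after cloning (a hypothetical snippet, run from the text-generation-webui root), you can confirm the clone landed where the loader looks for it:

import sys
from pathlib import Path

repo = Path("repositories/exllama")
assert repo.is_dir(), "clone exllama into repositories/ first"
sys.path.insert(0, str(repo))  # roughly what modules/exllama.py does before `from generator import ...`
import generator               # resolves to repositories/exllama/generator.py if the clone is in place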

shivam-51 commented 9 months ago

@jamesbraza Did you find any solution to this, or is it not possible to run exllama on a Mac M1?

MrOiseau commented 9 months ago

I think it's possible; I just haven't had time to tackle it with focus 😄 When I get some time and fix it, I'll post here how I solved it.

shivam-51 commented 9 months ago

> I think it's possible; I just haven't had time to tackle it with focus 😄 When I get some time and fix it, I'll post here how I solved it.

Can you share which implementation of Llama you are using, if you are using one?