oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

After following the llama.cpp manual, it does not recognize CUDA. #1090

Closed. Penguin5353 closed this issue 1 year ago.

Penguin5353 commented 1 year ago

Describe the bug

I followed the manual at this link: https://github.com/oobabooga/text-generation-webui/wiki/llama.cpp-models. This is the command I started with:

pip install -r requirements.txt -U

After running this command, the web UI recognized GGML models just fine, but not models in the generic .pt / .safetensors formats.

Is there an existing issue for this?

Reproduction

These are the commands I ran to add the llama.cpp module:

conda activate textgen
cd C:\Users\KHJ\text-generation-webui
pip install -r requirements.txt -U

These are the commands I used to launch the model as usual after adding the llama.cpp module:

conda activate textgen
cd C:\Users\KHJ\text-generation-webui
python server.py --model Alpaca-native-7b-4bit --wbits 4 --groupsize 128 --extensions api google_translate whisper_stt silero_tts elevenlabs_tts --no-stream --chat

Screenshot

No response

Logs

(textgen) C:\Users\KHJ\text-generation-webui>python server.py --model Alpaca-native-7b-4bit --wbits 4 --groupsize 128 --extensions api google_translate whisper_stt silero_tts elevenlabs_tts --no-stream --chat
Loading Alpaca-native-7b-4bit...
CUDA extension not installed.
Found the following quantized model: models\Alpaca-native-7b-4bit\alpaca7b-4bit.pt
Loading model ...
Traceback (most recent call last):
  File "C:\Users\KHJ\text-generation-webui\server.py", line 350, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "C:\Users\KHJ\text-generation-webui\modules\models.py", line 103, in load_model
    model = load_quantized(model_name)
  File "C:\Users\KHJ\text-generation-webui\modules\GPTQ_loader.py", line 136, in load_quantized
    model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold)
  File "C:\Users\KHJ\text-generation-webui\modules\GPTQ_loader.py", line 63, in _load_quant
    model.load_state_dict(torch.load(checkpoint), strict=False)
  File "C:\Users\KHJ\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 809, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "C:\Users\KHJ\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1172, in _load
    result = unpickler.load()
  File "C:\Users\KHJ\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1142, in persistent_load
    typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "C:\Users\KHJ\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1116, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "C:\Users\KHJ\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 217, in default_restore_location
    result = fn(storage, location)
  File "C:\Users\KHJ\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "C:\Users\KHJ\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
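
The traceback bottoms out in torch.serialization: torch.cuda.is_available() returns False, so PyTorch cannot map the CUDA tensors stored in the GPTQ .pt checkpoint back onto the GPU. A minimal way to check which PyTorch build the textgen environment actually has (just a diagnostic sketch, run inside the activated conda env):

import torch

# A CPU-only wheel typically reports a "+cpu" version suffix and no CUDA version.
print(torch.__version__)
print(torch.version.cuda)         # None on a CPU-only build
print(torch.cuda.is_available())  # must be True to load a 4-bit GPTQ checkpoint on the GPU

If this prints False despite an NVIDIA GPU being present, the usual fix is reinstalling a CUDA-enabled torch build into the environment.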

System Info

Windows 10 64-bit (not Linux, not Ubuntu, not WSL)
CPU: Ryzen 5600X
RAM: DDR4 16 GB (2 x 8 GB)
GPU: RTX 3070 8 GB
crazyblok271 commented 1 year ago

Yeah, I had .cuda models working before, but then they did an update to the web UI and it's broken now. So annoying.

RedNax67 commented 1 year ago

Did you set this (when on WSL)?

export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH

Penguin5353 commented 1 year ago

Did you set this (when on WSL)?

export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH

No, I don't use WSL.

Penguin5353 commented 1 year ago

I reinstalled via the one-click installer and that resolved the problem, but I'll keep this issue open for now, since a reinstall isn't a complete resolution.
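
If the problem comes back after another update, one quick sanity check is whether the quantized loader can actually reach the GPU (a sketch; quant_cuda is assumed here to be the extension module that GPTQ-for-LLaMa builds, whose absence is what produces the "CUDA extension not installed." message):

import importlib.util
import torch

# Both of these need to hold for --wbits 4 models to load on the GPU.
print("CUDA-enabled torch:", torch.cuda.is_available())
print("quant_cuda kernel present:", importlib.util.find_spec("quant_cuda") is not None)

If either check fails, rerunning the one-click installer (or rebuilding the GPTQ CUDA extension) is the likely remedy.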

Patjwmiller commented 1 year ago

Sometimes you gotta reinstall the requirements when things get updated.