oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

Llama-cpp-python 0.2.81 'already loaded' fails to load models #6202

Open · Patronics opened this issue 4 days ago

Patronics commented 4 days ago

Describe the bug

Attempting to load a model after running the update-wizard-macos today (the version from a day or two ago worked fine) fails with the stack trace log included below.

Notably, the error message references a newly opened issue in llama-cpp-python (#1575).

Is there an existing issue for this?

  • [x] I have searched the existing issues

Reproduction

  • Run the update wizard to update the software.
  • Attempt to load a GGUF model using the GPU and llama.cpp.
  • Observe that loading fails.

Screenshot

Screenshot 2024-07-04 at 11 10 47 PM

Logs

Traceback (most recent call last):
  File "/Users/patrickleiser/Documents/Programming/AI/text-generation-webui/modules/ui_model_menu.py", line 246, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
  File "/Users/patrickleiser/Documents/Programming/AI/text-generation-webui/modules/models.py", line 94, in load_model
    output = load_func_map[loader](model_name)
  File "/Users/patrickleiser/Documents/Programming/AI/text-generation-webui/modules/models.py", line 275, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_file)
  File "/Users/patrickleiser/Documents/Programming/AI/text-generation-webui/modules/llamacpp_model.py", line 39, in from_pretrained
    LlamaCache = llama_cpp_lib().LlamaCache
  File "/Users/patrickleiser/Documents/Programming/AI/text-generation-webui/modules/llama_cpp_python_hijack.py", line 38, in llama_cpp_lib
    raise Exception(f"Cannot import 'llama_cpp_cuda' because '{imported_module}' is already imported. See issue #1575 in llama-cpp-python. Please restart the server before attempting to use a different version of llama-cpp-python.")
Exception: Cannot import 'llama_cpp_cuda' because 'llama_cpp' is already imported. See issue #1575 in llama-cpp-python. Please restart the server before attempting to use a different version of llama-cpp-python.
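
For context, the exception above is raised by the webui's modules/llama_cpp_python_hijack.py, which allows only one llama-cpp-python variant (llama_cpp for CPU/Metal, llama_cpp_cuda for CUDA) to be active per process and refuses to switch once one has been imported. A minimal sketch of that guard, reconstructed only from the traceback (the prefer_cuda argument is a hypothetical stand-in for however the real code selects the variant):

import importlib

imported_module = None  # tracks which llama-cpp-python variant this process has loaded


def llama_cpp_lib(prefer_cuda=False):
    """Return the llama-cpp-python module to use, importing it on first call."""
    global imported_module
    lib_name = 'llama_cpp_cuda' if prefer_cuda else 'llama_cpp'
    if imported_module is not None and imported_module != lib_name:
        # Only one variant can be active per process (llama-cpp-python issue #1575);
        # switching requires restarting the server.
        raise Exception(
            f"Cannot import '{lib_name}' because '{imported_module}' is already imported. "
            f"Please restart the server before attempting to use a different "
            f"version of llama-cpp-python."
        )
    imported_module = lib_name
    return importlib.import_module(lib_name)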

System Info

M1 Max MacBook Pro, macOS 14.5

Edit: I just realized that oobabooga was the one who created that issue on the llama-cpp-python project, so I guess this error is already known. Sorry if this issue is therefore somewhat redundant.

gmarkley-VI commented 4 days ago

I am seeing the same exception.

System Info: M3 Max MacBook Pro, macOS 14.5

phr00t commented 1 day ago

Same here. My GPU doesn't seem to be processing GGUFs at all. Oobabooga is broken as far as I'm concerned until this is fixed.

This isn't just a Mac issue; I'm on Windows 11.

danielw97 commented 1 day ago

I'm also experiencing this, although on Linux in my case. If there's any testing or troubleshooting that can be done to try to narrow this down let me know.
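
One generic check that might help narrow this down (a plain Python snippet, not an official troubleshooting step from the project) is to see which llama-cpp-python variants are installed in the webui's environment and which one the running process has already imported:

import importlib.util
import sys

# Report, for each llama-cpp-python variant named in the error message,
# whether it is installed and whether this process has already imported it.
for name in ('llama_cpp', 'llama_cpp_cuda'):
    installed = importlib.util.find_spec(name) is not None
    already_imported = name in sys.modules
    print(f"{name}: installed={installed}, already imported={already_imported}")

If one variant shows up as already imported while the loader is asking for the other, that matches the guard in the traceback above.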

poopro commented 12 hours ago

I think oobabooga is working with abetlen on a solution; for now there is nothing to do but wait. Any idea whether using a previous archive would make it work again? My work has also been stopped for 4 days because of this.

dgamer0775 commented 3 hours ago

On the oobabooga web UI, go to the Models page and select a model, then scroll down and check the 'cpu' option if it is unchecked. With that set, the model should load successfully.

phr00t commented 3 hours ago

On the oobabooga web UI, go to the Models page and select a model, then scroll down and check the 'cpu' option if it is unchecked. With that set, the model should load successfully.

... but I don't want to load and run my model just on my CPU.