simonw / llm

Access large language models from the command-line
https://llm.datasette.io
Apache License 2.0
4.78k stars 264 forks

Following simonw's blog yields `Error:` #309

Open hacker-DOM opened 1 year ago

hacker-DOM commented 1 year ago

Hi! I have followed every step in Run Llama 2 on your own Mac using LLM and Homebrew, in particular:

pipx install llm # python 3.11
llm install llm-llama-cpp
llm install llama-cpp-python
llm llama-cpp download-model \
  https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/resolve/main/llama-2-7b-chat.ggmlv3.q8_0.bin \
  --alias llama2-chat --alias l2c --llama2-chat

Next, I ran

❯ llm -m l2c 'Tell me a joke about a llama'
Error: 

Any ideas how I can debug this issue?

Update: I am on macOS Sonoma (14.0), Apple M1.
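When `llm` reports only a bare `Error:`, one way to dig deeper is to load the model directly with llama-cpp-python (the library llm-llama-cpp wraps), which surfaces the underlying exception. A minimal sketch; the model path below is an assumption, so substitute wherever your download actually landed:

```python
# Sketch: load the model directly with llama-cpp-python to surface the
# underlying exception that `llm` collapses into a bare "Error:".
# MODEL is an assumed location -- point it at your downloaded file.
from pathlib import Path

MODEL = Path.home() / "models" / "llama-2-7b-chat.ggmlv3.q8_0.bin"

def try_load(path):
    """Return None on success, or a string describing the failure."""
    try:
        from llama_cpp import Llama
    except ImportError as exc:
        return f"llama-cpp-python is not importable: {exc}"
    try:
        Llama(model_path=str(path))  # raises with the real loader error
    except Exception as exc:
        return str(exc)
    return None

if __name__ == "__main__":
    print(try_load(MODEL) or "model loaded OK")
```

Whatever message this prints (an unsupported-format complaint, a missing-file error, etc.) is far more actionable than the empty `Error:`.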

mhaeussermann commented 1 year ago

Since the blog post was written (Aug 2023), llama.cpp switched to a new file format (GGUF instead of GGML). GGML is no longer supported by llama.cpp, so that might be the issue. Try it with the equivalent GGUF file: https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF/resolve/main/llama-2-7b-chat.Q8_0.gguf.
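If you are unsure which format a downloaded file actually is, the GGUF container begins with the ASCII magic bytes `GGUF`, so a quick check is possible. This sketch only distinguishes GGUF from everything else (the legacy GGML family used several different magics):

```python
# Sketch: identify a model file's container format by its magic bytes.
# GGUF files start with the four ASCII bytes "GGUF"; anything else is
# treated here as legacy (old GGML-family formats or unknown).
def model_format(path):
    with open(path, "rb") as f:
        magic = f.read(4)
    return "gguf" if magic == b"GGUF" else "legacy (ggml-family or unknown)"
```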

terwey commented 1 year ago

Is there no way to switch to a compatible llama.cpp?

I'm on a metered, slow connection and it already took two attempts to receive the file (llm doesn't yet support resuming downloads, which Hugging Face does)...

mhaeussermann commented 1 year ago

If you already have the GGML file, you should be able to convert it using llama.cpp's convert-llama-ggml-to-gguf.py script. You also don't need to download the model through llm for it to work with llm: you can put the file in your models folder and edit models.json to include it. Going back to a "compatible" older version would likely introduce a bunch of dependency issues.
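The models.json edit can be scripted. This is a sketch only: the dictionary layout (a filename key holding `path`, `aliases`, and a `llama2_chat` flag) is an assumption modelled on what `download-model` appears to write, so compare against an existing entry in your own models.json before relying on it:

```python
# Sketch: register an already-downloaded GGUF file with llm-llama-cpp by
# adding an entry to its models.json. The schema below is an ASSUMPTION --
# check an existing entry in your own models.json and match that.
import json
from pathlib import Path

def register_model(models_json: Path, model_path: Path, aliases, llama2_chat=True):
    """Add (or overwrite) an entry for model_path, keyed by filename."""
    models = json.loads(models_json.read_text()) if models_json.exists() else {}
    models[model_path.name] = {
        "path": str(model_path),
        "aliases": list(aliases),
        "llama2_chat": llama2_chat,  # assumed key mirroring --llama2-chat
    }
    models_json.write_text(json.dumps(models, indent=2))
```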

faizanbashir commented 1 year ago

I downloaded a GGUF model, but it still fails with a bare Error: and no further detail.

llm llama-cpp download-model \
  https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q8_0.gguf --alias llama2-chat --alias l2c --llama2-chat

After installing, I tried the two commands below, but both returned the same bare Error:

  1. llm prompt -m l2c 'Tell me a joke about a llama'
  2. llm -m l2c 'Tell me a joke about a llama'

simonw commented 1 year ago

Could you run this command and paste in the output?

llm plugins

You should have 0.2b1 of llm-llama-cpp - otherwise you'll need to upgrade it with:

llm install -U llm-llama-cpp

You may also have better success with https://github.com/simonw/llm-gpt4all - the most recent release of that has been giving me really good results.

sjain74 commented 9 months ago

I was seeing the same "Error:" issue with the GGML model. After downloading and using the GGUF model instead, I now see:

/Users/Sanjay/Documents/workspaces/AI/llama-2-mac-simonwillison> llm -m l2c 'Tell me a joke about a llama'
Abort

Additional notes:

  1. It takes some time before printing the "Abort" message.
  2. llm plugins reports:

     [
       {
         "name": "llm-gpt4all",
         "hooks": ["register_models"],
         "version": "0.3"
       },
       {
         "name": "llm-llama-cpp",
         "hooks": ["register_commands", "register_models"],
         "version": "0.3"
       }
     ]
  3. A "Python quit unexpectedly" crash report on my Mac appears to be the cause of the Abort, with the following thread the likely culprit:

Thread 6 Crashed:: Dispatch queue: ggml-metal
0   libsystem_kernel.dylib    0x7ff8138ebffe  __pthread_kill + 10
1   libsystem_pthread.dylib   0x7ff8139221ff  pthread_kill + 263
2   libsystem_c.dylib         0x7ff81386dd24  abort + 123
3   libllama.dylib            0x10facafe8     0x10f9d2000 + 1019880
4   libllama.dylib            0x10fa5ac14     0x10f9d2000 + 560148
5   libdispatch.dylib         0x7ff81376b34a  _dispatch_client_callout2 + 8
6   libdispatch.dylib         0x7ff81377cb8b  _dispatch_apply_redirect_invoke + 418
7   libdispatch.dylib         0x7ff81376b317  _dispatch_client_callout + 8
8   libdispatch.dylib         0x7ff81377ac0c  _dispatch_root_queue_drain + 673
9   libdispatch.dylib         0x7ff81377b25c  _dispatch_worker_thread2 + 160
10  libsystem_pthread.dylib   0x7ff81391ef8a  _pthread_wqthread + 256
11  libsystem_pthread.dylib   0x7ff81391df57  start_wqthread + 15