getumbrel / llama-gpt

A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!
https://apps.umbrel.com/app/llama-gpt
MIT License
10.73k stars 696 forks

code-13b ggml-metal issue - macOS M1 Pro #88

Closed: mostlyvirtual closed this issue 1 year ago

mostlyvirtual commented 1 year ago

Hello. ./run-mac.sh works well with 7b and 13b, but when I start it with --model code-13b, it appears to start fine and loads the model, then returns this error on the first query:

ggml_metal_graph_compute: command buffer 0 failed with status 5
GGML_ASSERT: /private/var/folders/bm/qs9__78s4c77966vtyyczhc40000gn/T/pip-install-3x1cahig/llama-cpp-python_9a6cdba886054cafbfcf5301b89491f6/vendor/llama.cpp/ggml-metal.m:1

After that error, the API crashes and doesn't recover.

mostlyvirtual commented 1 year ago

I think it might be a RAM limitation - you can close.
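
For anyone who wants to sanity-check the RAM theory on their own machine, here is a rough sketch in Python. It compares the size of the model file on disk with the physical RAM reported by the OS. The function names and the 1.5x headroom factor are assumptions made for illustration, not values taken from llama-gpt or llama.cpp, and the result is only a heuristic: it cannot confirm that memory is what triggered the Metal error above.

```python
import os


def total_ram_bytes() -> int:
    """Return physical RAM in bytes using POSIX sysconf (macOS and Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def likely_fits(model_file_bytes: int, ram_bytes: int, headroom: float = 1.5) -> bool:
    """Heuristic check: is RAM at least `headroom` times the model file size?

    The 1.5 default is an assumed fudge factor covering the context buffer,
    the OS, and other running apps. It is not a documented requirement.
    """
    return ram_bytes >= model_file_bytes * headroom


def check_model(path: str) -> bool:
    """Print a short report for the model file at `path` (hypothetical helper)."""
    size = os.path.getsize(path)
    ram = total_ram_bytes()
    gib = 1024 ** 3
    print(f"model file: {size / gib:.2f} GiB, physical RAM: {ram / gib:.2f} GiB")
    return likely_fits(size, ram)
```

Calling check_model() with the path to the downloaded model file prints both sizes and returns False when the machine is probably too small under this heuristic. Apple Silicon shares one memory pool between the CPU and GPU, so other open apps reduce what is available to the model.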

lukechilds commented 1 year ago

Thanks for the update!