getumbrel / llama-gpt

A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!
https://apps.umbrel.com/app/llama-gpt
MIT License

Assertion Error #68

Closed Whiskey-Bravo closed 1 year ago

Whiskey-Bravo commented 1 year ago

When I try to run the "docker compose up" command, it downloads the model and then throws an AssertionError. I have tried deleting the model manually multiple times, but it still doesn't seem to work.

(screenshot of the AssertionError attached)

1Wayne1 commented 1 year ago

Same issue here.

RichardScottOZ commented 1 year ago

Yes, just tried this; same here with the 70b model as well.

RichardScottOZ commented 1 year ago

The model works in llama.cpp, so perhaps this is an issue with a new config format in the model?

THeivers commented 1 year ago

Seems to be a problem with the latest version of llama.cpp, from what I can gather in https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/discussions/14

I got it working locally (for now, until the core problem is resolved) by following the suggestion to go back to an older version, so I replaced the image tag with an older pinned digest.

For example

llama-gpt-api-7b:
    image: ghcr.io/abetlen/llama-cpp-python:latest

to

llama-gpt-api-7b:
    image: ghcr.io/abetlen/llama-cpp-python@sha256:b6d21ff8c4d9baad65e1fa741a0f8c898d68735fff3f3cd777e3f0c6a1839dd4

This digest was published about 9 days ago (there were newer ones, but most were only about a day old, so I figured they might all share the same problem). You could always try a later one than that. You can see all of them here: https://github.com/abetlen/llama-cpp-python/pkgs/container/llama-cpp-python/versions?filters%5Bversion_type%5D=untagged&page=1
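As a quick sanity check before editing the compose file, you can confirm that a candidate digest actually exists and pulls cleanly. A minimal sketch, using the digest suggested above (substitute any other digest from the versions page):

```shell
# Pull the pinned image by digest to verify it exists and downloads cleanly.
docker pull ghcr.io/abetlen/llama-cpp-python@sha256:b6d21ff8c4d9baad65e1fa741a0f8c898d68735fff3f3cd777e3f0c6a1839dd4

# Then bring the stack up with the edited docker-compose.yml:
docker compose up
```

Pinning by `@sha256:` digest rather than a tag guarantees the exact same image is used on every pull, which is useful when `latest` has regressed.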

mayankchhabra commented 1 year ago

Sorry for the confusion, folks! This was resolved as part of #71, which added support for Code Llama models. It should be fixed in the master branch now. You were correct in your assessment, @THeivers.

You can retry with:

git pull origin master
./run.sh --model 7b # or run this if you are on an M1/M2 mac: ./run-mac.sh --model 7b

Replace 7b with 13b, 70b, code-7b, code-13b, or code-34b.

RichardScottOZ commented 1 year ago

Thanks. As I understand from getting this working yesterday, llama.cpp now uses GGUF by default rather than GGML?
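That's right: recent llama.cpp only loads GGUF, which is why the old GGML downloads started failing with assertion errors. For anyone with a previously downloaded GGML model, the llama.cpp repository ships a conversion script. A rough sketch (the script name and flags are from llama.cpp around this time, and the file names are placeholders; check the current repo for the exact invocation):

```shell
# Convert an existing GGML model file to GGUF using llama.cpp's
# bundled script. Input/output file names here are examples only.
python convert-llama-ggml-to-gguf.py \
  --input  llama-2-7b-chat.ggmlv3.q4_0.bin \
  --output llama-2-7b-chat.Q4_0.gguf
```

Alternatively, re-download the model in GGUF form (e.g. from TheBloke's `-GGUF` repos on Hugging Face) rather than converting.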