abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

loading error in llama cpp /llama2 #653

Open · Bhavya-TR opened this issue 1 year ago

Bhavya-TR commented 1 year ago

llama.cpp: loading model from models\llama-2-7b-chat.ggmlv3.q8_0.bin
error loading model: unknown (magic, version) combination: 67676a74, 00000003; is this really a GGML file?
llama_init_from_file: failed to load model

AlessandroSpallina commented 1 year ago

GGML models only work with older versions of llama-cpp-python (this Docker image works: ghcr.io/abetlen/llama-cpp-python@sha256:b6d21ff8c4d9baad65e1fa741a0f8c898d68735fff3f3cd777e3f0c6a1839dd4).

For newer versions, use models in the GGUF format instead.
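
For the newer-version path, here is a minimal sketch using the current high-level API, assuming a GGUF file is already on disk (the path and parameters below are placeholders):

    from llama_cpp import Llama

    # Newer llama-cpp-python releases expect GGUF files; pointing this at a
    # GGML file fails with the "unknown (magic, version) combination" error above.
    llm = Llama(model_path="models/llama-2-7b-chat.Q8_0.gguf", n_ctx=4096)

    # Run a small completion to confirm the model loaded.
    out = llm("Q: Name a planet. A:", max_tokens=16)
    print(out["choices"][0]["text"])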

bash-bandicoot commented 1 year ago

Or convert the existing GGML file to GGUF (a quick check of the result is sketched after these steps):

  1. Download the conversion script from llama.cpp (https://github.com/ggerganov/llama.cpp/blob/master/convert-llama-ggmlv3-to-gguf.py)
  2. pip install gguf --force-reinstall --upgrade --no-cache-dir
  3. python convert-llama-ggmlv3-to-gguf.py -i ggml_model.bin -o gguf_model.bin --gqa 8 -c 4096 (note: --gqa 8 is for 70B models; for a 7B model like the one in this issue, drop --gqa so it defaults to 1)
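
A quick way to confirm the conversion worked, sketched in Python and assuming the output filename from step 3: GGUF files begin with the ASCII magic "GGUF", while the old GGJT/GGML files begin with "ggjt" (0x67676a74, the value printed in the error above).

    # Sanity-check the converted file's magic bytes (filename from step 3).
    with open("gguf_model.bin", "rb") as f:
        magic = f.read(4)
    if magic == b"GGUF":
        print("Looks like a GGUF file")
    else:
        print(f"Unexpected magic: {magic!r} (not a GGUF file)")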