Cinnamon / kotaemon

An open-source RAG-based tool for chatting with your documents.
https://cinnamon.github.io/kotaemon/
Apache License 2.0

[BUG] - llama-cpp will not load local models #203

Open · ajweber opened this issue 4 weeks ago

ajweber commented 4 weeks ago

Description

I have tried a number of Hugging Face models and consistently get the error message: llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 292, got 291

This appears to be an old bug that was fixed months ago in llama-cpp. Is it possible your run_linux.sh script is installing an older version of llama-cpp (and/or its Python bindings, llama-cpp-python)?
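
One way to check which version the script actually installed (a sketch; the environment location is an assumption and will depend on where run_linux.sh placed it):

```sh
# Activate the environment that run_linux.sh created
# (path is a guess; adjust to wherever the script put it).
source install_dir/env/bin/activate

# Either of these reports the installed llama-cpp-python version.
pip show llama-cpp-python
python -c "import llama_cpp; print(llama_cpp.__version__)"
```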

Reproduction steps

1. Set the LOCAL_MODEL env var to a llama-3.1-8B...gguf model downloaded from Hugging Face.
2. Execute run_linux.sh (roughly as sketched below).
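
Roughly the shape of the repro; the model path below is a placeholder for whichever GGUF you downloaded:

```sh
# Point kotaemon at a local GGUF model, then start the app.
export LOCAL_MODEL=/path/to/your-model.gguf   # placeholder path
bash run_linux.sh
```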

Screenshots

No response

Logs

No response

Browsers

Other

OS

Linux

Additional information

No response

ajweber commented 3 weeks ago

I still cannot test fully due to the other bug I logged (the startup issue). HOWEVER, if I change the script to download and install llama_cpp_python v0.2.90, it loads the local model and the app starts correctly.

(End of output is:

INFO:     Started server process [0000]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://localhost:31415 (Press CTRL+C to quit))
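
For anyone else hitting this, the change boils down to pinning the package; a minimal sketch, assuming you run it inside the environment that run_linux.sh set up:

```sh
# Force-reinstall the pinned release that loaded the model for me.
pip install --force-reinstall "llama-cpp-python==0.2.90"
```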