There is not enough information here to assist. What kind of model did you load? A LoRA? A quantized GGUF (Q4, Q5, etc.)? Second, that version of llama-cpp most likely will not bind to your GPU, so inference will be slow unless you are on a unified-memory system (Apple M1/M2/M3).
Lastly, we are going to remove the built-in native runner because of issues like this and its lack of GPU support; otherwise we would just be reinventing technology that is better handled by other local LLM runners like Ollama, LM Studio, or LocalAI. Whatever your model is, it will likely be easier to run and get inference from it via Ollama, and then use that connection in AnythingLLM.
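As a rough sanity check of that Ollama route, something like the sketch below could help. It assumes you have already exported your fine-tune as a GGUF file and imported it into Ollama (for example, a Modelfile whose FROM line points at the .gguf, then `ollama create my-finetune -f Modelfile`), and that Ollama is listening on its default port 11434; `my-finetune` is a placeholder name, not anything from this thread. If this request returns text, pointing AnythingLLM's Ollama connection at the same base URL should work too.

```python
import requests

# Placeholder name for the model you imported into Ollama.
MODEL_NAME = "my-finetune"
# Ollama's default local endpoint; AnythingLLM would talk to the same base URL.
OLLAMA_URL = "http://localhost:11434/api/generate"

def quick_prompt(prompt: str) -> str:
    """Send a single non-streaming prompt to Ollama and return the generated text."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL_NAME, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(quick_prompt("Say hello in one sentence."))
```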
I have my own fine-tuned model; I placed it in the download folder and it is detected by the Native LLM selection. But as soon as I prompt something, it crashes with [Failed to load model]. Need help with this.