Closed: EC-Sol closed this issue 2 months ago with the following comment:
The model is simply too large for your hardware. You can split or offload it between the GPU and CPU manually, as explained in https://github.com/ollama/ollama/issues/6595#issuecomment-2329425060. However, this issue seems to come from llama.cpp and needs to be fixed in llama.cpp first. You can still try running it manually, though it may be very slow.
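As a minimal sketch of the manual offload that comment describes: Ollama exposes a num_gpu option (the number of layers sent to the GPU), and lowering it keeps more of the model in system RAM. The value 10 below is illustrative, not a recommendation; tune it to your VRAM, and note it may still hit the same allocation error until the upstream llama.cpp fix lands. Interactively:

ollama run llama3.1:70b
>>> /set parameter num_gpu 10

Or per request through the REST API:

curl http://localhost:11434/api/generate -d '{"model": "llama3.1:70b", "prompt": "Why is the sky blue?", "options": {"num_gpu": 10}}'

The layers not sent to the GPU run on the CPU, which is why generation can be very slow.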
What is the issue?
When I try

ollama run llama3.1:70b

this error occurs:

Error: llama runner process has terminated: error loading model: unable to allocate backend buffer
However,

ollama run llama3.1:8b

works fine.

My Env:

OS: Windows
GPU: AMD
CPU: AMD
Ollama version: 0.3.10-0-g486ae43

Here are the full logs of the ollama server.