I'm finding the same as well.
I had a similar problem when my model file had not fully downloaded. Perhaps check the file size.
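A quick way to check is to compare the local size against what the server reports (the Hugging Face URL below is just an illustration, substitute whichever model you actually downloaded):

```
# Size of the local files in the plugin's models directory
ls -lh "$(llm llama-cpp models-dir)"

# Expected size from the server, without re-downloading
curl -sIL "https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGML/resolve/main/llama-2-70b-chat.ggmlv3.q5_K_M.bin" \
  | grep -i content-length
```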
I've not tried the wizardcoder model, but I notice it says:

> Please note that these GGMLs are not compatible with llama.cpp
I've encountered this issue as well. I've found that downloading the llama.cpp repo and running the following command works on my M2 Ultra:

```
make -j && ./main -t 20 -ngl 40 -m "$(llm llama-cpp models-dir)/llama-2-70b-chat.ggmlv3.q5_K_M.bin" \
  -p "Building a website can be done in 10 simple steps:" --color -c 2048 --temp 0.7 --repeat_penalty 1.1 --no-mmap --ignore-eos -n 64 -gqa 8
```
From what I've pieced together elsewhere, the issue with these quantized models is related to the `-gqa 8` flag.
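If that's the cause, the fix may be getting that setting through to llama-cpp-python, which (in the GGML era) accepted an `n_gqa` parameter for the 70B models. I haven't verified that the llm-llama-cpp plugin exposes it, but assuming `-o` options are forwarded to the model, something like this might work:

```
# Assumption: the plugin forwards this option through to llama-cpp-python's
# Llama(model_path=..., n_gqa=8); Llama 2 70B GGML requires n_gqa=8
llm -m llama-2-70b-chat.ggmlv3.q5_K_M -o n_gqa 8 "hello world"
```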
When I run:

```
llm -m llama-2-70b-chat.ggmlv3.q5_K_M "hello world"
```

I just get `Error:` in response.
Step 1: download the model.
Step 2: prompt it.
The error happened.
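For completeness, here are those two steps as commands, a minimal sketch of the reproduction (the download URL is illustrative, substitute the file you actually used):

```
# Step 1: download the model with the llm-llama-cpp plugin
llm llama-cpp download-model \
  https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGML/resolve/main/llama-2-70b-chat.ggmlv3.q5_K_M.bin

# Step 2: prompt it (this is where the bare "Error:" comes back)
llm -m llama-2-70b-chat.ggmlv3.q5_K_M "hello world"
```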