nigel-daniels closed this issue 1 year ago
Addendum
Here are the commands I am using to generate my model:
python3 convert.py --outfile models/7B/ggml-model-f16.bin --outtype f16 ../../llama2/llama/llama-2-7b --vocab-dir ../../llama2/llama/llama-2-7b
./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0
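As a sanity check (just a suggestion, assuming the main binary was built in the same llama.cpp checkout as quantize), the quantized file can also be exercised directly with llama.cpp itself before passing it to anything else:

./main -m ./models/7B/ggml-model-q4_0.bin -p "Hello" -n 32

If that runs, the GGML file itself is fine and the problem is on the loading side rather than in the conversion or quantization steps.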
Fastest fix ever... 2.0.0, released an hour after I posted this, fixed the issue!!
Thanks.
I am trying to load a local model (Llama2_7B) built using llama.cpp and then quantized to q4_0. When I attempt to load the model I get the error:
To use a quantized model, are there parameters I need to set, or is something else preventing this model from loading? For my test I am just passing in the path.
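For context, here is a minimal sketch of what "just passing in the path" looks like, assuming the llama-cpp-python bindings (an assumption on my part; the binding actually used in the report is not shown above, so treat this as illustrative only):

```python
# Minimal sketch, assuming llama-cpp-python is installed
# (pip install llama-cpp-python). The binding used in the original
# report is not shown, so this is illustrative only.
from llama_cpp import Llama

# Path to the q4_0 model produced by the quantize step above.
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")

# Simple generation call to confirm the model loaded.
output = llm("Q: What is the capital of France? A:", max_tokens=16)
print(output["choices"][0]["text"])
```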