withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
https://node-llama-cpp.withcat.ai
MIT License
983 stars · 91 forks

Failed to load model #11

Closed · nigel-daniels closed this issue 1 year ago

nigel-daniels commented 1 year ago

I am trying to load a local Llama 2 7B model built using llama.cpp and then quantized to q4_0. When I attempt to load the model I get this error:

error loading model: unknown (magic, version) combination: 46554747, 00000001; is this really a GGML file?

Are there parameters I need to set to use a quantized model, or is something else preventing this model from loading? For my test I am just passing in the path.
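For reference, the magic 46554747 is the bytes 0x46 0x55 0x47 0x47, i.e. the ASCII string "FUGG", which is the "GGUF" file magic read as a little-endian 32-bit integer. In other words, the file is in the newer GGUF format while the loader only understands the older GGML format. One way to check which format a model file is in is to read its first four bytes, as in this minimal Node.js sketch (the file path is illustrative):

import fs from "node:fs";

// Read the first four bytes of the model file. GGUF files start with the
// ASCII magic "GGUF"; older GGML-family files start with magics such as
// "ggml", "ggmf", or "ggjt".
const fd = fs.openSync("./models/7B/ggml-model-q4_0.bin", "r");
const magic = Buffer.alloc(4);
fs.readSync(fd, magic, 0, 4, 0);
fs.closeSync(fd);
console.log(magic.toString("ascii"));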

nigel-daniels commented 1 year ago

Addendum

Here are the commands I am using to generate my model:

python3 convert.py --outfile models/7B/ggml-model-f16.bin --outtype f16 ../../llama2/llama/llama-2-7b --vocab-dir ../../llama2/llama/llama-2-7b
./quantize  ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0
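Note that in llama.cpp checkouts from around this time, convert.py emits GGUF-format output even when the --outfile name ends in .bin, which would explain the GGUF magic reported in the error above.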
nigel-daniels commented 1 year ago

Fastest fix ever... v2.0.0, released an hour after I posted this, fixed the issue!!

Thanks.
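For anyone landing here later, loading a GGUF model with node-llama-cpp 2.x looks roughly like the following. This is a minimal sketch based on the 2.x LlamaModel/LlamaContext/LlamaChatSession API; the model path is illustrative.

import path from "path";
import {fileURLToPath} from "url";
import {LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// Load the quantized GGUF model from disk.
const model = new LlamaModel({
    modelPath: path.join(__dirname, "models", "7B", "ggml-model-q4_0.bin")
});

// Create an evaluation context and a chat session on top of it.
const context = new LlamaContext({model});
const session = new LlamaChatSession({context});

const answer = await session.prompt("Hi there, how are you?");
console.log(answer);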