aseok closed this issue 1 year ago
You can find the instructions here: https://github.com/bigcode-project/starcoder.cpp#quick-start
I have downloaded the model files separately and skip downloading them in convert-hf-to-ggml.py. My problem is with quantization, and probably with running inference: how do I pass the model files in the quantization command? Should I rename them?
how to pass the model files in quantization command?
For the sharded model conversion, don't pass the filenames; pass the directory that contains them:
$ python convert-hf-to-ggml.py ./starcoder
Then quantization works as in the README example:
$ ./quantize starcoder-ggml.bin starcoder-ggml-q4_1.bin 3
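If it helps, here is a rough sketch that wraps the two steps in a small shell helper. The function names `quantized_name` and `convert_and_quantize` are made up for illustration and are not part of starcoder.cpp. It also assumes you already know where the conversion script writes its ggml file, so you pass that path explicitly; check the script's output for the actual location.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are hypothetical, not part of starcoder.cpp.

# Derive the quantized file name from the ggml file name,
# e.g. starcoder-ggml.bin + q4_1 -> starcoder-ggml-q4_1.bin
quantized_name() {
  local ggml_file="$1" suffix="$2"
  printf '%s-%s.bin\n' "${ggml_file%.bin}" "$suffix"
}

# Convert a local checkpoint directory, then quantize the result.
#   $1: directory holding the downloaded model files (passed whole, not file by file)
#   $2: path of the ggml file the conversion step produces (assumption: supplied by you)
convert_and_quantize() {
  local model_dir="$1" ggml_file="$2"
  [ -d "$model_dir" ] || { echo "not a directory: $model_dir" >&2; return 1; }

  # Same invocation as in the thread: pass the directory, not the shard names.
  python convert-hf-to-ggml.py "$model_dir" || return 1

  [ -f "$ggml_file" ] || { echo "expected ggml file not found: $ggml_file" >&2; return 1; }

  # The trailing 3 is the type argument used with the q4_1 output in the command above.
  ./quantize "$ggml_file" "$(quantized_name "$ggml_file" q4_1)" 3
}

# Example call (not run automatically):
#   convert_and_quantize ./starcoder starcoder-ggml.bin
```

No renaming of the downloaded files should be needed with this approach, since only the directory and the resulting ggml file are referenced.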
Does that fix your issue @aseok?
Hi. Please provide conversion and quantization instructions for the main StarCoder model files.