meta-llama / llama3

The official Meta Llama 3 GitHub site

Can't quantize the model using LLama.cpp #130

Closed Codedestructor56 closed 6 months ago

Codedestructor56 commented 6 months ago

Encountered an error while attempting to quantize a model using the `./quantize` command. The quantization process failed with the following error message:


```
main: quantizing './models/llama_model/ggml-model-f32.gguf' to './model/llama_model/ggml-model-Q4_K_M.gguf' as Q4_K_M
llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from ./models/llama_model/ggml-model-f32.gguf (version GGUF V3 (latest))
llama_model_quantize: failed to quantize: basic_ios::clear: iostream error
main: failed to quantize model from './models/llama_model/ggml-model-f32.gguf'
```

The error occurred during an attempt to quantize the specified model file. Prior to the error, the loading process of the model metadata was successful, as indicated by the log message. However, during the actual quantization process, an unexpected error occurred, suggesting an issue with the input/output stream. Further investigation is needed to diagnose and address the underlying cause of this error.
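For context, `basic_ios::clear: iostream error` is the message libstdc++ reports when a stream operation fails with exceptions enabled — commonly because the output file could not be opened. A minimal shell sketch of one way that can happen (the `demo/` paths here are made up for illustration, not taken from the report):

```shell
# Create only the 'models' directory tree:
mkdir -p demo/models/llama_model

# Writing to a path whose parent directory exists succeeds:
echo data > demo/models/llama_model/out.gguf && echo "write ok"

# Writing under a misspelled 'model/' directory fails -- the file
# can never be opened, analogous to the failed stream in quantize:
echo data > demo/model/llama_model/out.gguf 2>/dev/null || echo "write failed"
```

Running this prints `write ok` followed by `write failed`, since the second redirection targets a directory that was never created.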
Codedestructor56 commented 6 months ago

Working on this issue, digging through llama.cpp's internals right now. Will probably figure something out. In the meantime, any help would be appreciated.

Codedestructor56 commented 6 months ago

The issue: I wrote "model" instead of "models" in the output path (`./model/llama_model/...` rather than `./models/llama_model/...`), so the destination directory didn't exist. Sorry about that :)
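For anyone hitting the same message: a sketch of the corrected invocation, with both arguments under the existing `./models/` tree (this assumes the positional `input output type` usage of llama.cpp's quantize tool, as seen in the log above):

```shell
# Both paths now use './models/'; the failing run wrote to './model/'.
./quantize ./models/llama_model/ggml-model-f32.gguf \
           ./models/llama_model/ggml-model-Q4_K_M.gguf Q4_K_M
```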