Right now Llamero requires that you add the prompt template details yourself.
Not a big deal, but those details are already embedded in the model file itself. So, using the GGUF spec, we should add support for reading the initial bytes of the file to pull out the information needed for the chat template.
This would be an excellent convenience.
GGUF Spec for reference: https://github.com/ggerganov/ggml/blob/master/docs/gguf.md
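As a rough sketch of what the parsing could look like (Python here purely for illustration; `read_chat_template` is a hypothetical helper name, and the value-type codes come from the spec's `gguf_metadata_value_type` enum). Only the header and key/value section are read, so this touches just the first few kilobytes of the file, never the tensor data:

```python
import struct

GGUF_MAGIC = b"GGUF"
# Fixed byte sizes for the scalar value types in the GGUF spec.
SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
STRING, ARRAY = 8, 9

def _read_string(f):
    # GGUF strings: uint64 length followed by UTF-8 bytes.
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")

def _skip_value(f, vtype):
    # Advance past a value we don't care about.
    if vtype in SIZES:
        f.read(SIZES[vtype])
    elif vtype == STRING:
        _read_string(f)
    elif vtype == ARRAY:
        # Arrays: uint32 element type, uint64 count, then the elements.
        etype, count = struct.unpack("<IQ", f.read(12))
        for _ in range(count):
            _skip_value(f, etype)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")

def read_chat_template(path):
    """Return the model's tokenizer.chat_template string, or None if absent."""
    with open(path, "rb") as f:
        if f.read(4) != GGUF_MAGIC:
            raise ValueError("not a GGUF file")
        # Header: uint32 version, uint64 tensor count, uint64 metadata KV count.
        _version, _tensors, kv_count = struct.unpack("<IQQ", f.read(20))
        for _ in range(kv_count):
            key = _read_string(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            if key == "tokenizer.chat_template" and vtype == STRING:
                return _read_string(f)
            _skip_value(f, vtype)
    return None
```

Models exported by llama.cpp's conversion scripts store the template under the `tokenizer.chat_template` metadata key, so a lookup like this should cover the common case; models without that key would still fall back to manual configuration.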