Right now Llamero requires that you add the prompt template details yourself.
Not a big deal, but those details are already embedded in the model file itself. So, using the GGUF spec, we should add support for reading the initial bytes of the file to pull out the information needed for the chat template.
This would be an excellent convenience.
GGUF Spec for reference: https://github.com/ggerganov/ggml/blob/master/docs/gguf.md
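As a rough sketch of what the parsing could look like (Python here purely for illustration; `read_chat_template` is a hypothetical helper name, and the value-type codes come from the spec's `gguf_metadata_value_type` enum). Only the header and key/value section are read, so this touches just the first few kilobytes of the file, never the tensor data:

```python
import struct

GGUF_MAGIC = b"GGUF"
# Fixed byte sizes for the scalar value types in the GGUF spec.
SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
STRING, ARRAY = 8, 9

def _read_string(f):
    # GGUF strings: uint64 length followed by UTF-8 bytes.
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")

def _skip_value(f, vtype):
    # Advance past a value we don't care about.
    if vtype in SIZES:
        f.read(SIZES[vtype])
    elif vtype == STRING:
        _read_string(f)
    elif vtype == ARRAY:
        # Arrays: uint32 element type, uint64 count, then the elements.
        etype, count = struct.unpack("<IQ", f.read(12))
        for _ in range(count):
            _skip_value(f, etype)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")

def read_chat_template(path):
    """Return the model's tokenizer.chat_template string, or None if absent."""
    with open(path, "rb") as f:
        if f.read(4) != GGUF_MAGIC:
            raise ValueError("not a GGUF file")
        # Header: uint32 version, uint64 tensor count, uint64 metadata KV count.
        _version, _tensors, kv_count = struct.unpack("<IQQ", f.read(20))
        for _ in range(kv_count):
            key = _read_string(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            if key == "tokenizer.chat_template" and vtype == STRING:
                return _read_string(f)
            _skip_value(f, vtype)
    return None
```

Models exported by llama.cpp's conversion scripts store the template under the `tokenizer.chat_template` metadata key, so a lookup like this should cover the common case; models without that key would still fall back to manual configuration.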