Maximilian-Winter / llama-cpp-agent

The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It allows users to chat with LLMs, execute structured function calls, and get structured output. It also works with models that are not fine-tuned for JSON output and function calls.

Chat format ignored #53

Closed woheller69 closed 4 months ago

woheller69 commented 5 months ago

Trying the example chatbot_using_local_model.py with WizardLM-2 (WizardLM-2-7B.Q8_0.gguf) gives:

```
Using fallback chat format: None
User: 
```

but the example defines CHATML as the format:

```python
predefined_messages_formatter_type=MessagesFormatterType.CHATML
```

Is the chat format ignored?
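
For reference, the relevant setup in the example looks roughly like this (a minimal sketch; the import paths and constructor arguments are assumed from the llama-cpp-agent and llama-cpp-python versions current at the time, not copied verbatim from the example):

```python
from llama_cpp import Llama
from llama_cpp_agent.llm_agent import LlamaCppAgent
from llama_cpp_agent.messages_formatter import MessagesFormatterType

# Loading the local GGUF model; llama-cpp-python prints the
# "Using fallback chat format: None" line during this step.
main_model = Llama(
    "WizardLM-2-7B.Q8_0.gguf",
    n_ctx=4096,
    verbose=False,
)

# The formatter type passed here is what llama-cpp-agent uses to build
# the prompt, independent of llama-cpp-python's own chat format.
agent = LlamaCppAgent(
    main_model,
    system_prompt="You are a helpful assistant.",
    predefined_messages_formatter_type=MessagesFormatterType.CHATML,
)

while True:
    user_input = input("User: ")
    print(agent.get_chat_response(user_input))
```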

woheller69 commented 5 months ago

With debug output switched on, the format seems to be applied correctly. Does the `None` come from llama-cpp-python or from llama-cpp-agent?

woheller69 commented 5 months ago

Regarding the order, it:

1. tries to get the template from the GGUF,
2. tries to guess the format,
3. uses the `predefined_messages_formatter`.

Right?

Maximilian-Winter commented 5 months ago

@woheller69 I currently use my own messages formatter system and the completion endpoints, not the chat completion.

I think the `Using fallback chat format: None` message comes from llama-cpp-python.

You can ignore it, since I don't use the chat completion endpoints.
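
Roughly, the difference looks like this (a sketch, assuming llama-cpp-python's standard completion API; the ChatML prompt string is illustrative):

```python
from llama_cpp import Llama

llama = Llama("WizardLM-2-7B.Q8_0.gguf", verbose=False)

# Chat completion endpoint: llama-cpp-python applies its own chat
# format/template to the messages. This is the path llama-cpp-agent
# does NOT use, so the fallback chat format is never consulted.
# llama.create_chat_completion(messages=[{"role": "user", "content": "Hi"}])

# Completion endpoint: the caller supplies the fully formatted prompt.
# llama-cpp-agent builds this string itself from its messages formatter
# (here: ChatML).
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nHi<|im_end|>\n"
    "<|im_start|>assistant\n"
)
output = llama(prompt, max_tokens=128, stop=["<|im_end|>"])
print(output["choices"][0]["text"])
```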

woheller69 commented 5 months ago

OK, so the format always has to be specified. It would be nice if, in the future, there were an option to apply the format from the GGUF, because it is often difficult to find out which template is required, and newer GGUFs usually contain that information...
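
For what it's worth, llama-cpp-python seems to expose the GGUF metadata already, so the embedded template can at least be inspected (a sketch; the `metadata` attribute and the `tokenizer.chat_template` key are assumptions based on how newer GGUFs store the template):

```python
from llama_cpp import Llama

llama = Llama("WizardLM-2-7B.Q8_0.gguf", verbose=False)

# Assumption: Llama.metadata holds the GGUF key/value pairs, and newer
# GGUFs store the Jinja2 chat template under "tokenizer.chat_template".
template = llama.metadata.get("tokenizer.chat_template")
if template:
    print("Embedded chat template:")
    print(template)
else:
    print("No chat template in this GGUF; a formatter must be chosen manually.")
```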

Maximilian-Winter commented 4 months ago

@woheller69 I will try to make this possible.