Maximilian-Winter / llama-cpp-agent

The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It lets users chat with LLMs, execute structured function calls, and get structured output. It also works with models that are not fine-tuned for JSON output and function calls.

Avoid generating newlines and spaces rules in the GBNF grammar #64

Closed: imalros closed this issue 1 month ago

imalros commented 2 months ago

Maybe the feature is already there, but I am looking for a way to generate a GBNF grammar from my Pydantic models that does not format the JSON with newline and space characters, in order to reduce the LLM response time. Is there a way to disable newline and space characters in the grammar generated by gbnf_grammar_from_pydantic_models.py? If not, is there a reason the spaces and newlines should be included?
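To illustrate the motivation, here is a minimal sketch using only Python's standard `json` module. The `payload` dictionary is a made-up example, and character count is only a rough proxy for the number of tokens a model would have to generate.

```python
import json

# Hypothetical payload, used only to compare the two serializations.
payload = {"name": "Ada", "scores": [1.5, 2.0], "active": True}

# Pretty-printed JSON: newlines and indentation the model would have to emit.
pretty = json.dumps(payload, indent=2)

# Compact JSON: no whitespace between tokens at all.
compact = json.dumps(payload, separators=(",", ":"))

print(f"pretty:  {len(pretty)} characters")
print(f"compact: {len(compact)} characters")
```

Both strings parse back to the same object, so the whitespace carries no information; it only adds output length.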

Maximilian-Winter commented 2 months ago

@imalros The reason for the spaces is that otherwise some models will generate endless zeroes at the end of floating point numbers, and other issues like that. But I can try to look into an optional way to do it like you said.

imalros commented 2 months ago

That would be awesome!

I have no idea how to read GBNF grammar files, but after about 10 minutes of playing with gbnf_grammar_from_pydantic_models.py, I found that all the newlines and the majority of the spaces are created by the rule ws ::= ([ \t\n]+) at line 803. Changing it to ws ::= () removes all the newlines and most of the spaces, but it also creates a few anomalies.

Maximilian-Winter commented 2 months ago

@imalros I didn't think about that, but I think you are right about ws ::= ([ \t\n]+). You should be able to change it to ws ::= "" to disable all spaces in the JSON structure.
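For anyone who would rather not edit the generator source, the same substitution can be sketched as a post-processing step on the grammar string. This is not part of llama-cpp-agent: `disable_whitespace` and `SAMPLE_GRAMMAR` are hypothetical names, and the sample is a hand-written excerpt that assumes the `ws` rule sits on a single line of its own, as quoted above.

```python
import re

# Hypothetical excerpt of a generated grammar; real output has many more rules.
SAMPLE_GRAMMAR = r'''root ::= "{" ws "\"name\"" ws ":" ws string ws "}"
string ::= "\"" [^"\\]* "\""
ws ::= ([ \t\n]+)
'''

# Matches the whole single-line definition of the `ws` rule.
WS_RULE = re.compile(r"^ws\s*::=.*$", flags=re.MULTILINE)


def disable_whitespace(grammar: str) -> str:
    """Rewrite the `ws` rule so that it matches only the empty string."""
    # A callable replacement avoids any backslash interpretation in the new text.
    new_grammar, count = WS_RULE.subn(lambda _match: 'ws ::= ""', grammar)
    if count != 1:
        raise ValueError(f"expected exactly one ws rule, found {count}")
    return new_grammar


compact_grammar = disable_whitespace(SAMPLE_GRAMMAR)
print(compact_grammar)
```

Only the `ws` definition changes; every rule that references `ws` stays as it is. As noted earlier in the thread, removing whitespace may cause some models to produce anomalies such as runaway digits in floating point numbers, so the result is worth testing with the model in use.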