Maximilian-Winter / llama-cpp-agent

The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It allows users to chat with LLMs, execute structured function calls, and get structured output, and it also works with models not fine-tuned for JSON output or function calling.

[bug] `new_line` and `space` may give the LLM too much flexibility and lead it astray #12

Closed ibehnam closed 5 months ago

ibehnam commented 8 months ago

I've been working with Mistral and Mixtral models, and what I've noticed is that the grammar gives the models too much flexibility, which results in numerous cases where the LLM generates unbounded runs of spaces or newlines.

(I'm using your grammar example available in llama.cpp examples.)

Maximilian-Winter commented 8 months ago

@ibehnam Yes, I thought the problem was gone. I gave the grammar that much flexibility because otherwise weird problems arise, like the LLM appending endless zeroes to a number, or always writing a float even when given an integer, etc. I will try to write a grammar generator that defines only the necessary whitespace and gives the LLM the option of a single space or line break. Right now it is free to generate as much whitespace as it wants.
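For context, the difference between the two approaches can be sketched in GBNF, the grammar format llama.cpp uses. The rules below are illustrative only, not the actual rules shipped by llama-cpp-agent or llama.cpp:

```gbnf
# Recursive whitespace rule (sketch): matches any number of
# spaces/tabs/newlines, so the model may emit them endlessly.
ws ::= ([ \t\n] ws)?

# Bounded alternative (sketch): at most one optional space or
# newline at each point where the grammar inserts `ws`.
ws ::= (" " | "\n")?
```

With the recursive form, every `ws` position is a place where generation can loop; the bounded form caps each position at a single character.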

Maximilian-Winter commented 8 months ago

But that takes some time, I will try to do this. I think I will do it this month.

ibehnam commented 8 months ago

Thanks, I know it's a lot of work, and I'd appreciate it. I didn't know about the other weird problems that could arise without giving more flexibility to the LLM. The json.gbnf example in llama.cpp also uses a `ws` rule, but in that example two `ws` tokens can't appear next to each other. It looks like the source of the problem is that llama-cpp-agent allows that to happen.

Maximilian-Winter commented 7 months ago

@ibehnam Hi, can you take a look at your issue with the latest commit? I think I found the problem and fixed it.

ibehnam commented 7 months ago

@Maximilian-Winter Thanks a lot! Sure, I'll check it out as soon as I can and will update here.

Maximilian-Winter commented 6 months ago

@ibehnam Did you manage to check the new version?

ibehnam commented 6 months ago

@Maximilian-Winter Hi, yes, I actually just tried it again. It's definitely better than before, but sometimes even simple Pydantic classes like the following lead to errors with the latest llama.cpp server:

```python
from pydantic import BaseModel, Field

class Bio(BaseModel):
    first_name: str = Field(default=..., description="The person's first name")
    last_name: str = Field(default=..., description="The person's last name")
    age: int = Field(default=..., description="The person's age")
```

I get bad request errors roughly 10% of the time. I haven't been streaming the responses, so I'm not sure whether it's due to infinite generation of `\n`/`<space>`.
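One reason the grammar has to bound whitespace itself: standard JSON places no limit on whitespace between tokens, so a run of fifty newlines mid-object is still valid JSON and nothing downstream will reject it. A small stdlib-only sketch (the `has_unbounded_ws` helper is hypothetical, just to illustrate the symptom):

```python
import json
import re

def has_unbounded_ws(text: str, limit: int = 2) -> bool:
    """Flag whitespace runs longer than `limit` characters
    (hypothetical helper illustrating the symptom in this issue)."""
    return any(len(m.group()) > limit
               for m in re.finditer(r"[ \t\n]+", text))

good = '{"first_name": "Ada", "age": 36}'
bad = '{"first_name":' + "\n" * 50 + '"Ada", "age": 36}'

# Both strings parse: JSON itself tolerates unlimited whitespace,
# so only the generation grammar can prevent runaway runs.
assert json.loads(bad)["first_name"] == "Ada"
assert not has_unbounded_ws(good)
assert has_unbounded_ws(bad)
```

This is why a permissive `ws` rule can loop indefinitely without ever producing invalid output.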

Maximilian-Winter commented 6 months ago

@ibehnam Thank you for the model, I could reproduce the issue and fixed it in the repo. The problem was a line break followed by whitespace.

ibehnam commented 6 months ago

Thank you! I will also test it on some more advanced classes and update here if there are any issues.