Maximilian-Winter / llama-cpp-agent

The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It allows users to chat with LLMs, execute structured function calls, and get structured output. It also works with models that are not fine-tuned for JSON output and function calls.

Return control after function executed #55

Closed. ebarrragn closed this issue 2 months ago

ebarrragn commented 2 months ago

I'd like to stop execution after a function has been executed, mainly to save the time another LLM iteration would take (I have an old P40). Until now I have been instructing the agent to say something like '(End of message)', but the final response came back incomplete.

Looking into the code, I found a comment mentioning a 'return_control' flag that does not seem to be implemented (correct me if I am wrong).

I have implemented it in function_calling_agent.py, line 396, method generate_response, just before the line "if agent_sent_message:":

                if not isinstance(res, str):
                    if "params" in res:
                        params = res["params"]
                        if "return_control" in params:
                            if params["return_control"]:
                                break
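
To make the intent of the check easier to see outside the library, here is a minimal standalone sketch. It is not the actual FunctionCallingAgent loop: the helper names wants_return_control and count_iterations and the sample results are hypothetical, and it assumes each function result is either a plain string or a dict that may carry a "params" dict.

```python
from typing import Any, Iterable


def wants_return_control(res: Any) -> bool:
    """Return True when a function result asks the agent loop to stop.

    Assumption: results are either plain strings or dicts with an
    optional "params" dict, mirroring the patch above.
    """
    if isinstance(res, str) or not isinstance(res, dict):
        return False
    params = res.get("params")
    return isinstance(params, dict) and bool(params.get("return_control"))


def count_iterations(results: Iterable[Any]) -> int:
    """Hypothetical stand-in for the agent loop: one iteration per result."""
    iterations = 0
    for res in results:
        iterations += 1
        if wants_return_control(res):
            break  # hand control back instead of running another LLM pass
    return iterations


# Simulated function results; the third one is never reached.
sample = [
    "plain text result",
    {"params": {"return_control": True}},
    {"params": {"query": "never reached"}},
]
iterations = count_iterations(sample)
```

With this sample the loop stops on the second result, which is the saving the flag is meant to provide.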

I can try a pull request if that is fine.

ebarrragn commented 2 months ago

Here is a usage example. I have left the original string I was using to return control commented out:

from pydantic import BaseModel


class TaskReturnYesAndFinish(BaseModel):
    """
    Return 'Yes' when the task has been completed and there is nothing else to do.
    """
    return_control: bool

    def run(self, **kwargs):
        # Ask the agent loop to hand control back after this call.
        self.return_control = True
        # Using this string before
        # retval = "You answered 'Yes' and your task is completed, say: '<|end_of_turn|>' to finish."
        retval = "You answered 'Yes' and your task is completed."
        return retval

Maximilian-Winter commented 2 months ago

@ebarrragn You can use the normal LlamaCppAgent. It will just call the function and return the result.

ebarrragn commented 2 months ago

Interesting, I may not be understanding it correctly. When I use it, depending on the model, the agent either carries on calling functions until it decides to return control (1 to 4 more iterations) or gets into an infinite loop. I have seen the latter with 'Phi-3-mini-4k-instruct-q4.gguf'. Am I doing something wrong if the agent doesn't return control after a single function call?

EDIT: The reason for the original modification (which I don't mind keeping as an independent patch) is that I am using a step to clean the results of a search. The steps are:

1 - Search (currently in a Qdrant database and on Wikipedia).
2 - Rerank.
3 - Eliminate the L last results.
4 - Perform a pass with Mixtral over every result, asking whether the result is really relevant to the question; the ones that are not are purged. This is slow, and it is where the reduction in iterations makes an impact, hence the patch.
5 - Return the remaining results.
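
The steps above can be sketched roughly as follows. This is only an illustration: filter_results and its search, rerank, and is_relevant callables are hypothetical stand-ins for the Qdrant/Wikipedia search, the reranker, and the Mixtral relevance pass, none of which are shown in this thread.

```python
from typing import Callable, List


def filter_results(
    question: str,
    search: Callable[[str], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    is_relevant: Callable[[str, str], bool],
    drop_last: int,
) -> List[str]:
    """Search, rerank, drop the tail, then keep only relevant results."""
    candidates = search(question)              # step 1: search
    ranked = rerank(question, candidates)      # step 2: rerank
    keep = max(len(ranked) - drop_last, 0)
    trimmed = ranked[:keep]                    # step 3: eliminate the last results
    # Step 4: one relevance check per result (the slow LLM pass in practice).
    return [r for r in trimmed if is_relevant(question, r)]  # step 5


# Toy stand-ins so the sketch runs without any model or database.
docs = [
    "paris is the capital of france",
    "bananas are yellow",
    "france borders spain",
    "unrelated",
]
result = filter_results(
    "france",
    search=lambda q: list(docs),
    rerank=lambda q, rs: sorted(rs, key=lambda r: q not in r),
    is_relevant=lambda q, r: q in r,
    drop_last=1,
)
```

Because step 4 costs one model call per surviving result, any extra agent iteration wrapped around this pipeline adds to an already slow path.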

The creators of queries and receivers of the results are also llama agents.

Just to clarify my reasons: as I said, given that the use case might not be so general, I don't mind keeping it as a local patch here.

Maximilian-Winter commented 2 months ago

@ebarrragn Could you show me your code? I'm curious why this happens.

ebarrragn commented 2 months ago

Sure, I'll prepare an example case with only that part, so you don't have to dig through all of the code.

ebarrragn commented 2 months ago

I finally found some time. After the changes made last weekend my code is quite broken, so I will have a look at it this weekend. I adapted one of the examples to show the issue; I just made the LLM local, using Phi-3 as the model (I hope it's not nonsense). function_calling_agent.py.txt

ebarrragn commented 2 months ago

The Phi-3 problem seems to be related to llama.cpp.

Maximilian-Winter commented 2 months ago

You can simply use the LlamaCppAgent instead of the FunctionCallingAgent to return control immediately.

ebarrragn commented 2 months ago

Yes, the Phi-3 bug confused me because it wasn't stopping even with the LlamaCppAgent, but it actually seems to be a llama.cpp problem. I am closing this issue.