Open swapneils opened 1 year ago
Yes, I am also encountering the same issue, even with simple built-in tools. In cases where I removed a tool, the agents still run its command, so the output is a predictable error message, but their response to it is unpredictable. Sometimes they acknowledge the error and attempt a different action, but I have recorded many cases where they hallucinate that the command succeeded or misinterpret the response as a success. The misinterpretation happens very often when the command was run by a delegate agent/subagent.
With the misinterpreted command responses from delegate agents spawned through the create_agent command, there are two recurring cases that I have observed and recorded:
They use the message_agent command but send the request to an invalid agent ID, and misinterpret the 'agent not found' error as a failure, or sometimes bizarrely as a success.
I have noticed that the agent will sometimes misinterpret its own commands too. I think the cause of these problems lies somewhere in the following code blocks: https://github.com/farizrahman4u/loopgpt/blob/main/loopgpt/agent.py#L154-L181 https://github.com/farizrahman4u/loopgpt/blob/main/loopgpt/agent.py#L287-L340
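To illustrate what I mean by an unambiguous command response, here is a minimal sketch of wrapping every command call so the agent always sees an explicit SUCCESS or FAILURE marker. This is not LoopGPT's actual code: `CommandResult`, `run_command`, and the plain-dict `registry` are hypothetical names I made up for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class CommandResult:
    """Hypothetical container for the outcome of one command call."""
    command: str
    ok: bool
    detail: str

    def to_prompt(self) -> str:
        # Put the status first so the model cannot easily overlook it.
        status = "SUCCESS" if self.ok else "FAILURE"
        return f"[{status}] Command '{self.command}': {self.detail}"


def run_command(
    registry: Dict[str, Callable[..., Any]],
    name: str,
    args: Dict[str, Any],
) -> CommandResult:
    """Run a command from a plain dict registry (an assumption, not LoopGPT's API)."""
    tool = registry.get(name)
    if tool is None:
        # A removed or never-registered tool is reported as an explicit failure.
        return CommandResult(name, False, "command is not available in this session")
    try:
        output = tool(**args)
    except Exception as exc:
        # Any exception raised by the tool is also surfaced as a failure.
        return CommandResult(name, False, f"{type(exc).__name__}: {exc}")
    return CommandResult(name, True, str(output))
```

With something like this, a call to a removed tool and a call to an invalid agent ID would both come back tagged as FAILURE, instead of as free-form text the agent has to interpret.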
Expected Behavior
I ran the sample for getting weather information, but without setting up the fetch_weather command, which it tried to run anyway. Ideally, the system would notice that fetch_weather failed and construct an alternate plan not using the OpenWeatherMap API, and then continue with the rest of the goals (getting dressing tips and writing them to dressing_tips.txt).
Current behaviour
Instead, the system pretended that it was successful, and said it had "not been given any new commands since the last time [it] provided an output", choosing the do_nothing action.
Steps to reproduce
Run the WeatherGPT example from the README, but comment out the code that sets up GetWeather.
Possible solution
I don't see any places where the system itself is being asked whether it has completed a step of the plan. Maybe add that to the prompt, or add a "cheap model" (Auto-GPT uses ada, as I recall) to evaluate this based on the output and the plan step?
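A rough sketch of the "ask whether the step completed" idea, under my own assumptions: `ask_model` stands in for whatever cheap model call is used (any function taking a prompt string and returning the reply text), and the template wording and function names are hypothetical, not taken from LoopGPT or Auto-GPT.

```python
from typing import Callable

# Hypothetical prompt wording; a real prompt would likely need tuning.
VERIFY_TEMPLATE = (
    "Plan step: {plan_step}\n"
    "Command output: {command_output}\n"
    "Did the command output show that the plan step was completed? "
    "Answer with exactly YES or NO."
)


def build_verification_prompt(plan_step: str, command_output: str) -> str:
    """Fill the template with the plan step and the raw command output."""
    return VERIFY_TEMPLATE.format(plan_step=plan_step, command_output=command_output)


def step_completed(
    plan_step: str,
    command_output: str,
    ask_model: Callable[[str], str],
) -> bool:
    """Return True only if the model's reply starts with YES."""
    reply = ask_model(build_verification_prompt(plan_step, command_output))
    # Anything other than a clear YES is treated as "not completed".
    return reply.strip().upper().startswith("YES")
```

Treating every unclear reply as "not completed" is deliberate: for this bug, a false failure that triggers a re-plan seems less harmful than a false success followed by do_nothing.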
LoopGPT Version
latest