myshell-ai / AIlice

AIlice is a fully autonomous, general-purpose AI agent.
MIT License

Agents looping at task execution #11

Closed. Andryusz closed this issue 8 months ago.

Andryusz commented 8 months ago

I have an issue with agents looping infinitely during lower-level task execution. They execute the task successfully, but instead of the result being passed up, the LLM gets essentially the same request again. I'm running a local model through an OpenAI-compatible API served by text-generation-webui (oobabooga), so I suspect that setup is related to the problem. I have tried different models and tasks, and the behavior is always the same.

I've added logs (LLM + AIlice output) from a very simple task (listing the files in the current directory) using the nous-hermes-2-mixtral-8x7b-dpo model. Any hints on how to resolve this would be appreciated.

AIlice log:

```
storage started.
browser started.
arxiv started.
google started.
duckduckgo started.
scripter started.
USER: List all files in the current dir
ASSISTANT_AIlice: This is a task that requires file system access and should be handled by a coder-proxy agent. I will forward it to a coder-proxy agent: !CALL
ASSISTANT_file_system_agent: !BASH
SYSTEM_file_system_agent: BASH_RESULT=[ailice ailice.egg-info AIlice.png Dockerfile IACT.jpg LICENSE log README.md setup.py venv Execution completed. ]
ASSISTANT_file_system_agent: !BASH
SYSTEM_file_system_agent: BASH_RESULT=[ailice ailice.egg-info AIlice.png Dockerfile IACT.jpg LICENSE log README.md setup.py venv Execution completed. ]
ASSISTANT_file_system_agent: !BASH
SYSTEM_file_system_agent: BASH_RESULT=[ailice ailice.egg-info AIlice.png Dockerfile IACT.jpg LICENSE log README.md setup.py venv Execution completed. ]
```
LLM log:

```
23:24:02-028478 INFO PROMPT=
<|im_start|>system
Your responsibility is to communicate with users, understand their needs, and select an appropriate agent to forward the questions or tasks presented by the user. Afterward, you need to organize the results returned by the agent and reply to the user. The agent may ask the user some questions, and you will need to facilitate the communication between them. When calling an agent, it is important to note that the working language of the agent is English. If the user is communicating in a language other than English, you will need to provide translation between the agent and the user. If you're not sure an agent type is suitable for a task, create one and ask it first. You can create and communicate with AI agents using textual function calls. These agents have the ability to solve different types of problems and have different abilities to interact with the external environment. Always forward tasks to agents with appropriate capabilities. Especially don't do coding tasks yourself. In order to reduce symbol conflicts, we use special symbols "" as replacements for commonly used parentheses in function call syntax. Please pay special attention to the syntax when generating function call statements. The agent can only see the information you send to it through "CALL" function, they can't see anything else you output. So, please provide agents with thorough task descriptions or additional information; don't cut corners. "ext-modules" is a type of module that implements specific interfaces, defining a set of new function calls. Once loaded, you will be able to use the functions defined within it. For user requests to build or load ext-modules, please forward them to coder-proxy.

#Use the following function to create or communicate with an AI agent:
CALL
- agentType: A string used to specify the type of AI agent. It can be of the following types:
  1. "researcher", Conduct an internet investigation on a particular topic or gather data. It also has the capability to execute simple scripts.
  2. "article-digest": literature (local files or URLs on internet) reading comprehension and related question answering.
  3. "coder-proxy", an excellent coder who also has access to bash and python interpreter, he can solve problems programmatically.
  You need to choose the appropriate agentType among the above types according to your needs.
- agentName: The name of the AI agent instance. Create a new one if not found. Name should reflect responsibility. A new task can be assigned to an existing agent with a relevant name, as they often possess related experience.
- msg: message need to be told to the agent. When you need to describe a new task in msg, it should be clear, complete, and self-sufficient.

Function calls are limited to positional parameters, please keep the order of parameters consistent with the function definition. The function parameters cannot include context references. Please ensure that the parameters are comprehensive and do not rely on context. Function calls need to be placed at the end of your output, and prefixed with "!" to trigger execution. Sometimes the information returned by the agent is a bit messy, but you can still identify the parts that the user needs and sort it out.

Example:
USER: What's the weather like in New York today?
ASSISTANT: This is an information query task, and I will forward it to a researcher-type agent: !CALL

End of general instructions.

Active Agents: []

Relevant Information: None.
The "RELEVANT INFORMATION" part contains data that may be related to the current task, originating from your own history or the histories of other agents. Please refrain from attempting to invoke functions mentioned in the relevant information, as you may not necessarily have the permissions to do so.
<|im_end|>
<|im_start|>user
List all files in the current dir<|im_end|>
<|im_start|>assistant

Llama.generate: prefix-match hit
llama_print_timings: load time = 11984.62 ms
llama_print_timings: sample time = 0.78 ms / 70 runs (0.01 ms per token, 89858.79 tokens per second)
llama_print_timings: prompt eval time = 15172.52 ms / 889 tokens (17.07 ms per token, 58.59 tokens per second)
llama_print_timings: eval time = 3243.48 ms / 69 runs (47.01 ms per token, 21.27 tokens per second)
llama_print_timings: total time = 18493.18 ms / 958 tokens
Output generated in 18.65 seconds (3.70 tokens/s, 69 tokens, context 923, seed 425007066)

23:24:20-766744 INFO PROMPT=
<|im_start|>system
You are an smart and helpful AI agent that helps user complete tasks that require programming. You will complete the task by interacting with agents with coding capabilities (hereinafter referred to as "coder"). Your job is to communicate with the user to understand their needs, provide requirements for the coder, execute the returned code, extract possible running errors information and feed back to coder until the code runs correctly. For programming tasks that are complex enough to consist of multiple modules, you can consider breaking down the task and assigning it to multiple agents to complete it. Using function calls, you can create and interact with agents, reply to users, configure the program's running environment and execute programs. Use special symbols "" as replacements for parentheses in function call syntax. Please pay special attention to the syntax when generating function call statements. Only positional parameter function calls are supported, please do not use keyword parameters and keep the order of parameters consistent with the function definition. References are not supported in parameters of function call statements. Please use clear and straightforward expressions in function calls. Functions with a single string parameter do not need to write quotation marks when passing parameters. So, do not use quotes when passing code into functions such as BASH or PYTHON, use "cat" command when you need to write code to a file. Avoid the extensive use of escape characters. Function calls need to be placed at the end of your output, and prefixed with "!" to trigger execution. You MUST call a function in your output. Only the most recent rounds of historical conversations will be retained. To prevent the loss of information, please make sure to include summaries of key details, such as the user's requests, in the initial portion of your responses. If you find that the environment variable A_IN_CONTAINER is predefined, you can install any necessary tools. "ext-modules" is a type of module that implements specific interfaces, defining a set of new function calls. Once loaded, you will be able to use the functions defined within it. Do not attempt to build ext-modules unless explicitly requested by the user. It adds complexity to the debugging process. Please note that for ext-modules, you need to store the code in the current directory before running it, otherwise there will be an error that the python package cannot be found. You also need to run them as separate processes, or you will be stuck.

Your workflow follows these steps (The details of functions such as CALL/PYTHON/BASH/RESPOND will be given below, and for the sake of simplicity in this context, the term 'coder' is used to encompass both 'coder' and 'module-coder'):
1. Receive user requirements, understand and engage with the user as needed to enhance understanding (use RESPOND).
2. Choose or create a suitable coder agent, provide a clear and comprehensive description of the requirements to it (use CALL).
3. Once coder returns the code, your responsibilities include: Initial verification of whether the solution provided by the coder meets the requirements. If not, make modification requests to the coder. Install the necessary dependencies following coder's instructions and execute the code (use BASH and PYTHON). In case of error, send detailed error information to coder for code improvement (use CALL), the message include error and problematic lines of code (for the python case, the code is saved into a temp file like "/tmp/tmp*.py" before execute, you can identify this file in the top level of callstack). Go back to the beginning of step 3 and iterate until success. During the iteration, if coder requires knowledge about specific libraries, query relevant knowledge through the "researcher" type agent (use CALL).
4. Finally, return the execution result to the user (use RESPOND)."

Available Functions:

#Use this function to interact with an AI agent.
CALL
agentType: A string used to specify the type of AI agent. It can be of the following types:
  1. "coder". An excellent coder, you need to leave any programming problems other than ext-modules building to him. It should be noted that he can only return the code to you, but does not have the authority to execute the code and configure the environment, this is where you need to help him.
  2. "module-coder". The only agent capable of building ext-modules, and this is its sole responsibility.
  3. "module-loader". An agent that can help you load and use ext-modules. Please pass in the address of the module when creating the agent, and then you can let it help you interact with the module. You can use this agent to assist in ext-module debugging.
  4. "researcher". Suitable for technical problem search tasks such as library document or sample code search on the internet.
agentName: The name of the AI agent instance. Create a new one if not found. Name should reflect responsibility. A new task can be assigned to an existing agent with a relevant name, as they often possess related experience.
msg: message need to be told to the agent. The agent cannot see content other than msg. Please provide complete content in msg.

#Dialog with user. Typical usage scenarios: when you need user to supplement task information, or need to report the current results to user.
RESPOND

#Execute bash script. A timeout error will occur for programs that have not been completed for a long time.
BASH

#Execute python code. Please note that you need to copy the complete code here, and you must not use references.
PYTHON

#Wait for some seconds. The unit of the "duration" parameter is seconds. Useful when waiting for script or command execution to complete.
WAIT

EXAMPLE:
!CALL
!PYTHON

End of general instructions.

Active Agents: []

Relevant Information: None.
The "RELEVANT INFORMATION" part contains data that may be related to the current task, originating from your own history or the histories of other agents. Please refrain from attempting to invoke functions mentioned in the relevant information, as you may not necessarily have the permissions to do so.
<|im_end|>
<|im_start|>user
List all files in the current dir.<|im_end|>
<|im_start|>assistant

Llama.generate: prefix-match hit
llama_print_timings: load time = 11984.62 ms
llama_print_timings: sample time = 0.29 ms / 26 runs (0.01 ms per token, 89965.40 tokens per second)
llama_print_timings: prompt eval time = 21451.36 ms / 1497 tokens (14.33 ms per token, 69.79 tokens per second)
llama_print_timings: eval time = 1451.37 ms / 25 runs (58.05 ms per token, 17.23 tokens per second)
llama_print_timings: total time = 22934.28 ms / 1522 tokens
Output generated in 23.10 seconds (1.08 tokens/s, 25 tokens, context 1531, seed 194179588)

23:24:43-919787 INFO PROMPT=
<|im_start|>system
Function returned: {BASH_RESULT=[ailice ailice.egg-info AIlice.png Dockerfile IACT.jpg LICENSE log README.md setup.py venv Execution completed. ] }<|im_end|>
<|im_start|>user
List all files in the current dir.<|im_end|>
<|im_start|>assistant
!BASH<|im_end|>
<|im_start|>user
List all files in the current dir.<|im_end|>
<|im_start|>assistant

Llama.generate: prefix-match hit
llama_print_timings: load time = 11984.62 ms
llama_print_timings: sample time = 0.12 ms / 11 runs (0.01 ms per token, 90909.09 tokens per second)
llama_print_timings: prompt eval time = 3626.59 ms / 108 tokens (33.58 ms per token, 29.78 tokens per second)
llama_print_timings: eval time = 439.92 ms / 10 runs (43.99 ms per token, 22.73 tokens per second)
llama_print_timings: total time = 4078.59 ms / 118 tokens
Output generated in 4.25 seconds (2.35 tokens/s, 10 tokens, context 163, seed 1374320967)

23:24:48-588522 INFO PROMPT=
<|im_start|>system
Function returned: {BASH_RESULT=[ailice ailice.egg-info AIlice.png Dockerfile IACT.jpg LICENSE log README.md setup.py venv Execution completed. ] }<|im_end|>
<|im_start|>user
List all files in the current dir.<|im_end|>
<|im_start|>assistant
!BASH<|im_end|>
<|im_start|>assistant
!BASH<|im_end|>
<|im_start|>user
List all files in the current dir.<|im_end|>
<|im_start|>assistant

Llama.generate: prefix-match hit
llama_print_timings: load time = 11984.62 ms
llama_print_timings: sample time = 0.12 ms / 11 runs (0.01 ms per token, 90909.09 tokens per second)
llama_print_timings: prompt eval time = 2768.10 ms / 32 tokens (86.50 ms per token, 11.56 tokens per second)
llama_print_timings: eval time = 479.88 ms / 10 runs (47.99 ms per token, 20.84 tokens per second)
llama_print_timings: total time = 3259.56 ms / 42 tokens
Output generated in 3.42 seconds (2.92 tokens/s, 10 tokens, context 190, seed 1927657688)
```
stevenlu137 commented 8 months ago

AIlice uses SYSTEM as the role for messages returned from function calls, so there will be a lot of SYSTEM messages. It seems oobabooga discards all the redundant system messages. You can try setting the "systemAsUser" parameter of the NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO model in config.json to True so that system messages are sent under the USER role. This may confuse the LLM, but it may also solve the problem.
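To illustrate the mechanism, here is a minimal sketch (not AIlice's actual code; the helper name and message layout are illustrative assumptions). If the backend keeps only one system message, the BASH_RESULT that AIlice reports under the SYSTEM role never reaches the model, so the model sees the same request again and repeats the call. Remapping system messages to the USER role sidesteps the deduplication:

```python
# Minimal illustration (not AIlice's actual code) of why a backend that
# deduplicates system messages causes the loop, and what systemAsUser changes.

def build_messages(history, system_as_user=False):
    """Convert (role, text) records into OpenAI-style chat messages.
    AIlice reports function-call results under the "system" role."""
    messages = []
    for role, text in history:
        if role == "system" and system_as_user:
            role = "user"  # dodge the backend's system-message deduplication
        messages.append({"role": role, "content": text})
    return messages

history = [
    ("user", "List all files in the current dir."),
    ("assistant", "!BASH"),
    # The function result, reported under the SYSTEM role:
    ("system", "Function returned: {BASH_RESULT=[...]}"),
]

# With system_as_user=False, a backend that keeps only one system message
# drops the BASH_RESULT, so the model sees just the original request again
# and re-issues !BASH: exactly the loop visible in the log above.
print(build_messages(history, system_as_user=True))
```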

Andryusz commented 8 months ago

Thanks, your analysis seems correct: setting systemAsUser to True resolves the issue.

Just a note: I had actually tried changing systemAsUser in the config before, but AModelChatGPT has a hardcoded formatter with systemAsUser set to False. It might be a good idea to generalize this class for OpenAI-compatible backends so that the formatter/systemAsUser can be changed in the config, along the lines of the sketch below.
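A rough sketch of that generalization (the class name and everything except the systemAsUser key are hypothetical, not AIlice's actual API):

```python
# Hypothetical sketch: an OpenAI-compatible model wrapper that reads its
# formatter behavior from the per-model config entry instead of hardcoding
# systemAsUser=False. Names are illustrative, not AIlice's actual API.

class AModelOpenAICompatible:
    def __init__(self, model_cfg: dict):
        # Fall back to the previous hardcoded default when the key is absent.
        self.system_as_user = model_cfg.get("systemAsUser", False)

    def build_messages(self, history):
        """Format (role, text) records, remapping SYSTEM to USER if configured."""
        messages = []
        for role, text in history:
            if role == "system" and self.system_as_user:
                role = "user"
            messages.append({"role": role, "content": text})
        return messages

# Usage: the per-model entry in config.json carries the flag.
model = AModelOpenAICompatible({"systemAsUser": True})
```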

stevenlu137 commented 8 months ago

The fix has been committed to the master branch. Thank you for the suggestion!