Closed li-boxuan closed 2 months ago
Caveat: we also need to cap MAX_ITERATIONS to min(MAX_ITERATIONS, FULL_TASK_MAX_ITERATIONS - accumulated_iterations) every time a parent agent delegates a (sub)task to a new agent.
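A minimal sketch of the cap described in this caveat. The function name and the clamp at zero are assumptions added for illustration, not code from the repository; only the min(...) expression comes from the comment above.

```python
def delegate_iteration_cap(
    max_iterations: int,
    full_task_max_iterations: int,
    accumulated_iterations: int,
) -> int:
    """Return the iteration limit to hand to a newly delegated agent."""
    # Budget still left for the whole task, counting every agent so far.
    remaining = full_task_max_iterations - accumulated_iterations
    # Clamping at zero is an added assumption: once the task-wide budget
    # is spent, a new delegate gets no iterations rather than a negative cap.
    return max(0, min(max_iterations, remaining))


if __name__ == "__main__":
    # Plenty of budget left: the per-agent limit applies.
    print(delegate_iteration_cap(100, 250, 0))
    # Only 50 iterations left in the task-wide budget.
    print(delegate_iteration_cap(100, 250, 200))
```

Recomputing the cap at each delegation keeps the per-agent limit meaningful while making the task-wide limit the hard ceiling.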
Hi @li-boxuan! I would be happy to assist with this challenging task. Your documentation has helped me understand the problem very well. I have a question: Is there a way to address the prompt engineering issue that causes the parent agent to repeatedly delegate tasks? If so, where should I begin looking for the fix in the codebase?
Additionally, if I were to start setting the iterations for the agents, where would you recommend I should begin reading the code in the codebase?
Thank you very much!
@prithvi2226 Please go ahead!
Is there a way to address the prompt engineering issue that causes the parent agent to repeatedly delegate tasks? If so, where should I begin looking for the fix in the codebase
Yes, but that's not what this issue is about. And I don't think prompt engineering is enough. We should systematically detect that and reject the delegate request. That's harder and broader than what this issue is asking for.
if I were to start setting the iterations for the agents, where would you recommend I should begin reading the code in the codebase
Search for every place where MAX_ITERATIONS appears. You would start by copying it and modifying it.
Thank you for the guidance, @li-boxuan ! I understand that addressing the prompt engineering issue is a broader task. For now, I will focus on implementing the GLOBAL_MAX_ITERATIONS configuration.
I will start by searching for every instance where MAX_ITERATIONS appears in the codebase and work on copying and modifying it to include the new configuration. If I have any further questions or run into any issues, I will reach out.
Thanks again for your help!
Hi @li-boxuan
Update on Changes Made to Address Global Max Iterations Issue
Objective: Implement a global_max_iterations cap to ensure no agent exceeds a predefined number of iterations, while allowing individual task-specific iterations to be lower if needed.
Summary of Changes:
opendevin/core/schema/config.py: Added global_max_iterations attribute to AppConfig. Updated command-line argument parsing to include global_max_iterations.
opendevin/core/main.py: Ensured the agent respects the global_max_iterations by calculating the effective maximum iterations as the minimum of task-specific and global limits:
max_iterations = min(args.max_iterations, args.global_max_iterations)
Used max_iterations when initializing the AgentController.
opendevin/server/session/agent.py: Passed global_max_iterations from the start event to the agent configuration. Updated agent initialization to respect the global_max_iterations.
tests/integration/regenerate.sh: Added an environment variable for global_max_iterations to ensure it is set during script execution:
global_max_iterations=${global_max_iterations:-50}
tests/unit/test_arg_parser.py: Updated tests to include checks for the new global_max_iterations argument.
tests/unit/test_config.py: Added tests to verify that global_max_iterations is correctly parsed and applied from both environment variables and configuration files. Ensured global_max_iterations is respected when set from the command line or within configuration files.
opendevin/controller/state/state.py: Updated the State class to handle the global_max_iterations in its attributes and methods, ensuring that any resumed or restored state respects this limit.
opendevin/controller/agent_controller.py: Modified delegation logic to ensure that delegated agents also respect the global_max_iterations.
evaluation/swe_bench/run_infer.py: Updated to parse and respect global_max_iterations from the command-line arguments. Ensured that all task processing within the script respects this global limit.
evaluation/mint/env.py: Modified the environment class to check for both task-specific and global iteration limits. Added methods to calculate and enforce the effective maximum iterations.
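To make the min(...) step in the list above concrete, here is a self-contained sketch using argparse. The flag names, the default of 100 for the per-agent limit, and the helper name are assumptions for illustration; the fallback of 50 mirrors the shell default quoted above. It is not the code from the proposed change.

```python
import argparse
import os


def effective_max_iterations(argv, env=None):
    """Parse both limits and return the smaller one."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    # Hypothetical flag names and defaults, chosen for this sketch only.
    parser.add_argument("--max-iterations", type=int, default=100)
    parser.add_argument(
        "--global-max-iterations",
        type=int,
        # Fall back to the environment variable, then to 50.
        default=int(env.get("global_max_iterations", 50)),
    )
    args = parser.parse_args(argv)
    # The effective limit is the minimum of the two settings.
    return min(args.max_iterations, args.global_max_iterations)


if __name__ == "__main__":
    print(effective_max_iterations(["--max-iterations", "30"], env={}))
```

Note that a static minimum like this only bounds a single agent. It does not track iterations accumulated across delegated agents, so on its own it would not prevent the repeated-delegation case this issue describes.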
Next steps that I would take:
Validation: I wanted to confirm if these changes are correctly implemented and align with the desired behavior. I also wanted to check if the changes ensure that global_max_iterations is enforced across different agents and tasks.
Future Changes: I am going to apply similar updates to the remaining run_infer.py files and prompt.py file. Ensure that any further changes continue to respect the global_max_iterations limit.
Question for Confirmation: Am I on the right track with these changes to address the global_max_iterations issue? Should I proceed with similar updates for the remaining run_infer.py files and prompt.py file? By ensuring these updates, we aim to maintain a consistent and enforceable iteration limit across all agent activities, thereby adhering to the defined global cap. Please review and provide feedback or approval to continue with the remaining changes.
@prithvi2226 lol this reads like AI-generated. Did you use any AI tool to help you generate this summary? Regardless, this is too long of a proposal for a simple task. Please go ahead, create a PR, and let code talk.
@li-boxuan ! Hahaha Yeah, I had written some notes in my obsidian, and told gpt to rephrase it in a "polite and respectful" way. Btw yeah I will create a pull request later this evening, thanks for making it smoother!
@li-boxuan ! Hi! I have opened a PR, and would love to know your thoughts on it, any feedback would be appreciated. Thank you very much!
Is GLOBAL_MAX_ITERATIONS really available? Tried for 20mins to use it until I figured out that only MAX_ITERATIONS works for me
No, we didn't add a config called GLOBAL_MAX_ITERATIONS. It's just MAX_ITERATIONS that limits the number of iterations per conversation.
What problem or use case are you trying to solve?
We now have a MAX_ITERATIONS config, which limits the number of turns/steps/iterations in which an agent can interact with the LLM. This is very useful as a safeguard for the user's wallet.
With https://github.com/OpenDevin/OpenDevin/pull/1910, an agent can delegate to another agent, and every delegate agent that is instantiated gets MAX_ITERATIONS as its own limit. I have seen a case where my parent agent (due to a bug in prompt engineering) keeps delegating a task to a child agent, which exhausts MAX_ITERATIONS and then gets invoked again in the parent agent's next turn. This means we effectively have MAX_ITERATIONS * MAX_ITERATIONS as the global limit in the worst case.
We should have another config, e.g. GLOBAL_MAX_ITERATIONS (or maybe FULL_TASK_MAX_ITERATIONS) to control the total number of steps to finish a task.
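The worst case can be illustrated with a small simulation. This is a sketch under stated assumptions, not the project's controller logic: the function name is hypothetical, the parent is assumed to delegate on every turn, each child is assumed to exhaust its limit, and the parent's own steps are assumed to count toward the task-wide total.

```python
from typing import Optional


def worst_case_steps(
    max_iterations: int,
    global_max_iterations: Optional[int] = None,
) -> int:
    """Count total steps when the parent delegates on every single turn."""
    total = 0
    for _ in range(max_iterations):  # each pass is one parent turn
        if global_max_iterations is not None and total >= global_max_iterations:
            break  # task-wide budget is spent
        total += 1  # the parent's own step
        child_limit = max_iterations
        if global_max_iterations is not None:
            # Cap the child by whatever is left of the task-wide budget.
            child_limit = min(max_iterations, global_max_iterations - total)
        total += child_limit  # the child exhausts its limit
    return total


if __name__ == "__main__":
    print(worst_case_steps(10))                            # no global cap
    print(worst_case_steps(10, global_max_iterations=25))  # with a global cap
```

With MAX_ITERATIONS set to 10 and no global cap, the children alone take 10 * 10 = 100 steps, plus 10 parent steps. With a task-wide cap of 25, the total stops at 25 under the assumptions above.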