All-Hands-AI / OpenHands

🙌 OpenHands: Code Less, Make More
https://all-hands.dev
MIT License

New config: GLOBAL_MAX_ITERATIONS #2121

Closed li-boxuan closed 2 months ago

li-boxuan commented 4 months ago

What problem or use case are you trying to solve?

We now have a MAX_ITERATIONS config, which limits the number of turns/steps/iterations in which an agent can interact with the LLM. This is very useful as a safeguard for the user's wallet.

With https://github.com/OpenDevin/OpenDevin/pull/1910, an agent can delegate to another agent, and every delegate agent is instantiated with its own MAX_ITERATIONS limit. I have seen a case where my parent agent (due to a bug in prompt engineering) keeps delegating a task to a child agent, which exhausts MAX_ITERATIONS and then gets invoked again in the parent agent's next turn. This means that, in the worst case, the effective global limit is MAX_ITERATIONS * MAX_ITERATIONS.
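To make the worst case concrete, here is a minimal sketch that counts delegate steps when every parent turn spawns a fresh delegate that runs until its own limit. The function name `worst_case_delegate_steps` is hypothetical and is not part of the codebase; the parent's own turns would come on top of this count.

```python
def worst_case_delegate_steps(max_iterations: int) -> int:
    """Count delegate steps when every parent turn re-delegates the task.

    Hypothetical illustration only: this does not model the real controller.
    """
    delegate_steps = 0
    for _parent_turn in range(max_iterations):
        # Each new delegate starts with a full per-agent budget and exhausts it.
        for _delegate_step in range(max_iterations):
            delegate_steps += 1
    return delegate_steps


# With a per-agent limit of 30, the delegates alone can take 30 * 30 steps.
print(worst_case_delegate_steps(30))
```

The count grows quadratically with the per-agent limit, which is why a per-agent cap alone does not bound the cost of a whole task.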

We should have another config, e.g. GLOBAL_MAX_ITERATIONS (or maybe FULL_TASK_MAX_ITERATIONS) to control the total number of steps to finish a task.

Describe the UX of the solution you'd like

Do you have thoughts on the technical implementation?

Describe alternatives you've considered

Additional context

li-boxuan commented 4 months ago

Caveat: we also need to cap MAX_ITERATIONS to min(MAX_ITERATIONS, FULL_TASK_MAX_ITERATIONS - accumulated_iterations) every time a parent agent delegates a (sub)task to a new agent.
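A rough sketch of that cap, assuming a hypothetical `IterationBudget` helper that tracks the steps used so far across the whole task (the class and its method names are illustrative, not existing code). The clamp to zero is an added assumption so that an exhausted budget never yields a negative limit.

```python
from dataclasses import dataclass


@dataclass
class IterationBudget:
    """Hypothetical tracker for steps spent across a whole task."""

    full_task_max_iterations: int
    accumulated_iterations: int = 0

    def limit_for_new_delegate(self, max_iterations: int) -> int:
        # min(MAX_ITERATIONS, FULL_TASK_MAX_ITERATIONS - accumulated_iterations),
        # clamped at zero once the task-wide budget is used up (assumption).
        remaining = self.full_task_max_iterations - self.accumulated_iterations
        return max(0, min(max_iterations, remaining))

    def record(self, steps: int) -> None:
        # Add the steps an agent actually consumed to the task-wide total.
        self.accumulated_iterations += steps


budget = IterationBudget(full_task_max_iterations=100)
limits = []
for _ in range(4):
    limit = budget.limit_for_new_delegate(max_iterations=40)
    limits.append(limit)
    budget.record(limit)  # assume each delegate exhausts its limit

print(limits)
```

In this run the first two delegates get the full per-agent limit, the third gets only what is left of the task-wide budget, and the fourth gets nothing.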

prithvi2226 commented 4 months ago

Hi @li-boxuan! I would be happy to assist with this challenging task. Your documentation has helped me understand the problem very well. I have a question: Is there a way to address the prompt engineering issue that causes the parent agent to repeatedly delegate tasks? If so, where should I begin looking for the fix in the codebase?

Additionally, if I were to start setting the iterations for the agents, where would you recommend I should begin reading the code in the codebase?

Thank you very much!

li-boxuan commented 4 months ago

@prithvi2226 Please go ahead!

> Is there a way to address the prompt engineering issue that causes the parent agent to repeatedly delegate tasks? If so, where should I begin looking for the fix in the codebase

Yes, but that's not what this issue is about, and I don't think prompt engineering is enough. We should systematically detect repeated delegation and reject the delegate request. That's harder and broader than what this issue is asking for.

> if I were to start setting the iterations for the agents, where would you recommend I should begin reading the code in the codebase

Search for every place where MAX_ITERATIONS appears. You would start by copying it and modifying it.

prithvi2226 commented 4 months ago

Thank you for the guidance, @li-boxuan ! I understand that addressing the prompt engineering issue is a broader task. For now, I will focus on implementing the GLOBAL_MAX_ITERATIONS configuration.

I will start by searching for every instance where MAX_ITERATIONS appears in the codebase and work on copying and modifying it to include the new configuration. If I have any further questions or run into any issues, I will reach out.

Thanks again for your help!

prithvi2226 commented 3 months ago

Hi @li-boxuan

Update on Changes Made to Address Global Max Iterations Issue

Objective: Implement a global_max_iterations cap to ensure no agent exceeds a predefined number of iterations, while allowing individual task-specific iterations to be lower if needed.

Summary of Changes:

opendevin/core/schema/config.py: Added global_max_iterations attribute to AppConfig. Updated command-line argument parsing to include global_max_iterations.

opendevin/core/main.py: Ensured the agent respects the global_max_iterations by calculating the effective maximum iterations as the minimum of task-specific and global limits:

max_iterations = min(args.max_iterations, args.global_max_iterations)

Used max_iterations when initializing the AgentController.

opendevin/server/session/agent.py: Passed global_max_iterations from the start event to the agent configuration. Updated agent initialization to respect the global_max_iterations.

tests/integration/regenerate.sh: Added an environment variable for global_max_iterations to ensure it is set during script execution:

global_max_iterations=${global_max_iterations:-50}

tests/unit/test_argparser.py: Updated tests to include checks for the new global_max_iterations argument.

tests/unit/test_config.py: Added tests to verify that global_max_iterations is correctly parsed and applied from both environment variables and configuration files. Ensured global_max_iterations is respected when set from the command line or within configuration files.

opendevin/controller/state/state.py: Updated the State class to handle the global_max_iterations in its attributes and methods, ensuring that any resumed or restored state respects this limit.

opendevin/controller/agent_controller.py: Modified delegation logic to ensure that delegated agents also respect the global_max_iterations.

evaluation/swe_bench/run_infer.py: Updated to parse and respect global_max_iterations from the command-line arguments. Ensured that all task processing within the script respects this global limit.

evaluation/mint/env.py: Modified the environment class to check for both task-specific and global iteration limits. Added methods to calculate and enforce the effective maximum iterations.

Next steps that I would take:

Validation: I wanted to confirm if these changes are correctly implemented and align with the desired behavior. I also wanted to check if the changes ensure that global_max_iterations is enforced across different agents and tasks.

Future Changes: I am going to apply similar updates to the remaining run_infer.py files and prompt.py file. Ensure that any further changes continue to respect the global_max_iterations limit.

Question for Confirmation: Am I on the right track with these changes to address the global_max_iterations issue? Should I proceed with similar updates for the remaining run_infer.py files and prompt.py file? By ensuring these updates, we aim to maintain a consistent and enforceable iteration limit across all agent activities, thereby adhering to the defined global cap. Please review and provide feedback or approval to continue with the remaining changes.

li-boxuan commented 3 months ago

@prithvi2226 lol this reads like AI-generated. Did you use any AI tool to help you generate this summary? Regardless, this is too long of a proposal for a simple task. Please go ahead, create a PR, and let code talk.

prithvi2226 commented 3 months ago

@li-boxuan Hahaha, yeah. I had written some notes in Obsidian and told GPT to rephrase them in a "polite and respectful" way. Btw, I will create a pull request later this evening. Thanks for making it smoother!

prithvi2226 commented 3 months ago

@li-boxuan Hi! I have opened a PR and would love to know your thoughts on it. Any feedback would be appreciated. Thank you very much!

tom-doerr commented 2 months ago

Is GLOBAL_MAX_ITERATIONS really available? Tried for 20mins to use it until I figured out that only MAX_ITERATIONS works for me

li-boxuan commented 2 months ago

> Is GLOBAL_MAX_ITERATIONS really available? Tried for 20mins to use it until I figured out that only MAX_ITERATIONS works for me

No, we didn't add a config called GLOBAL_MAX_ITERATIONS. It's just MAX_ITERATIONS that limits the number of iterations per conversation.