microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/
Creative Commons Attribution 4.0 International

[Feature Request]: Agent which can intelligently ask for human input #1281

Closed krishnashed closed 8 months ago

krishnashed commented 8 months ago

Is your feature request related to a problem? Please describe.

Let's suppose we have 2 AssistantAgent agents (with human_input_mode="NEVER") trying to solve a programming problem in a pair-programming manner. There are times when both of them attempt a solution but it doesn't work: it just keeps throwing errors, and the agents never reach a working solution. It would be a great value add if the group chat could intelligently figure out that it's time to ask for some human input in order to reach a proper solution. Is there something like this already in AutoGen, or do we need to be doing something else?

Describe the solution you'd like

It would be great if there were some intermediate value for human_input_mode between ALWAYS and NEVER that intelligently asks for human input only when necessary.

Or do we need to create an AssistantAgent (A1) and add it to the group chat alongside the other 2 AssistantAgents (P1, P2) doing pair programming? A1's job would be to watch the performance of the two pair-programming agents, and when it sees that there's no good progress, the group chat LLM would call A1, which asks for human input to steer the chat toward solving the problem.

Additional context

No response

qingyun-wu commented 8 months ago

One way to do that, i.e., to intelligently figure out when it's time to ask for some human input, is to use function calls. You can write an ask_human_expert function, which simply prints the current results and solicits human input, and let the LLM decide when to use that function by registering it on one or both agents. You can find function call examples here: https://github.com/microsoft/autogen/blob/main/notebook/agentchat_function_call.ipynb and https://github.com/microsoft/autogen/blob/main/notebook/agentchat_two_users.ipynb. There is an ask_expert function, conceptually similar to the suggested ask_human_expert, in the latter notebook.

Regarding the suggested solution: the three-agent group chat solution is an interesting idea and presumably should work as well. It's worth trying.

Regarding support for an intermediate value of human_input_mode between ALWAYS and NEVER: we have another supported value, "TERMINATE". It is still not that dynamic, though. It would indeed be useful to support a more dynamic mode where the LLM decides when to solicit human input. This requires revising the check_termination_and_human_reply function in the ConversableAgent class.
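As a rough illustration of that idea, here is a sketch of a heuristic that an overridden check_termination_and_human_reply could call to decide when to fall back to human input. The function name should_ask_human, the error-marker list, and the threshold are all assumptions for illustration; nothing here is part of AutoGen's API.

```python
# Sketch only: a heuristic for detecting that agents appear stuck.
# `should_ask_human`, the markers, and the threshold are illustrative
# assumptions, not part of AutoGen's API.

def should_ask_human(messages, error_threshold=3):
    """Return True if the last `error_threshold` messages all look like
    failed attempts (e.g. tracebacks), suggesting the agents are stuck
    and a human should be consulted."""
    recent = messages[-error_threshold:]
    if len(recent) < error_threshold:
        return False
    error_markers = ("Traceback", "Error", "Exception")
    return all(
        any(marker in m.get("content", "") for marker in error_markers)
        for m in recent
    )

# A subclass of ConversableAgent could call this from an overridden
# check_termination_and_human_reply and prompt via input() when it
# returns True, instead of relying on a fixed human_input_mode.
```

This keeps the "when to ask" decision deterministic and cheap; the alternative discussed above (a registered function the LLM itself chooses to call) trades that determinism for more flexible judgment.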

@krishnashed let me know if what I said is clear to you, especially the function call idea. Thank you!

ekzhu commented 8 months ago

Here is an example. I haven't tested it, but you can tune it from here.


from typing import Annotated

from autogen import ConversableAgent, UserProxyAgent

assistant = ConversableAgent(
    name="assistant",
    system_message="You are an AI assistant. You can ask for help from a human expert using the ask_human_expert function.",
    llm_config={"config_list": config_list},  # register_for_llm needs an llm_config; assumes config_list is defined
)
# NEVER here so the proxy only prompts the human inside ask_human_expert;
# the name must not contain spaces for OpenAI-style APIs.
user_proxy = UserProxyAgent(name="user_proxy", human_input_mode="NEVER")

@assistant.register_for_llm(description="Function for asking human expert.")
@user_proxy.register_for_execution()
def ask_human_expert(question: Annotated[str, "The question you want to ask the human expert."]) -> Annotated[str, "Answer"]:
    answer = input(f"Please answer the question: {question}\n")
    return answer

# `message` must be passed as a keyword; the second positional argument
# of initiate_chat is not the message.
user_proxy.initiate_chat(assistant, message="Hey what is the age of the bedrock in grand canyon?")