microsoft / autogen

A programming framework for agentic AI 🤖
Creative Commons Attribution 4.0 International

Autogen Assistant app can't work with UserProxy's human_input_mode='ALWAYS' #813

Open miiiz opened 9 months ago

miiiz commented 9 months ago

It seems that the whole app only works with UserProxy's human_input_mode='NEVER'. If human_input_mode is set to 'ALWAYS', there is no way to enter human feedback in the frontend (it can only be entered in the shell). Only after the conversation has terminated can one enter another input in the frontend. Does that mean there is no back-and-forth chat between UserProxy and AssistantAgent, only a one-time conversation? Thank you.

julianakiseleva commented 9 months ago

@sonichi, can you please clarify this?

ruifengma commented 9 months ago

Yes, I have this issue as well. It seems that autogenra does not support human-in-the-loop mode: each time, I have to type in the command line rather than in the UI, otherwise the process freezes.

victordibia commented 9 months ago

Yes, at the moment this requires some streaming infrastructure, which is on the roadmap.

hughlv commented 8 months ago

I introduced a hack to solve this issue in my autogenra-like project.

You can refer to custom_get_human_input in this Jinja2 template:

The generated code looks like the following. The caller only needs to handle stdout and stdin for output/input, could also refer to

# This file is auto-generated by [FlowGen](
# Last generated: 2023-12-28 16:26:17
# Flow Name: Simple Chat
# Description:
"""
A ChatGPT-like simple chat that involves a human.

This flow sets the Human Input Mode to ALWAYS. That means whenever you receive a message from the Assistant, you as the user need to respond by typing a message in the chatbox. Sending 'exit' quits the conversation.

1. Send a simple message such as `What day is today?`.
2. To quit, send 'exit'.
3. Sometimes the Assistant will send back code and ask you to execute it; simply press Enter and the code will be executed.
"""

from dotenv import load_dotenv
load_dotenv()  # This will load all environment variables from .env

import argparse
import os
import time
from termcolor import colored

# Parse command line arguments
parser = argparse.ArgumentParser(description='Start a chat with agents.')
parser.add_argument('message', type=str, help='The message to send to agent.')
args = parser.parse_args()

import autogen

# openai, whisper and moviepy are optional dependencies, currently only used in the video transcript example.
# However, we believe they are useful for other future examples, so we include them here as part of the standard imports.
from openai import OpenAI
import whisper
from moviepy.editor import VideoFileClip

from IPython import get_ipython

from autogen import AssistantAgent
from autogen import UserProxyAgent

# Replace the default get_human_input function for status control
def custom_get_human_input(self, prompt: str) -> str:
    # Set wait_for_human_input to True
    print('__STATUS_WAIT_FOR_HUMAN_INPUT__', prompt, flush=True)
    reply = input(prompt)
    # Restore the status to running
    print('__STATUS_RECEIVED_HUMAN_INPUT__', prompt, flush=True)
    return reply

autogen.ConversableAgent.get_human_input = custom_get_human_input

config_list = autogen.config_list_from_json(
    # ... (the config source argument is missing from the post)
    filter_dict={
        "model": ["gpt-4-1106-preview", "gpt-4-vision-preview"],
    },
)

llm_config = {
    "config_list": config_list,
    "temperature": 0.5,
    "max_tokens": 1024,
}

node_vbhhpjj8xo = AssistantAgent(
    # ... (arguments are missing from the post)
)

user_proxy = UserProxyAgent(
    # ... (other arguments are missing from the post)
    system_message="Hello AI",
    is_termination_msg=lambda x: x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE"),
    code_execution_config={
        "work_dir": "work_dir",
    },
)
# Function template content generator
# register the functions

# Start the conversation
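
For the caller side mentioned above ("only needs to handle stdout and stdin"), here is a minimal sketch of how a frontend process might drive such a script. It is not the FlowGen implementation: `run_flow`, `CHILD_SOURCE` and `StubAgent` are hypothetical names, and the child script is a stand-in that uses the same class-level patch and status markers as the generated code instead of real AutoGen agents. The markers are matched with `in` rather than at the start of the line, on the assumption that the prompt echoed by `input()` can end up on the same line as the next marker when stdout is a pipe.

```python
import subprocess
import sys

WAIT_MARKER = "__STATUS_WAIT_FOR_HUMAN_INPUT__"
RECEIVED_MARKER = "__STATUS_RECEIVED_HUMAN_INPUT__"

# Hypothetical stand-in for the generated flow script: a stub agent class
# whose get_human_input is replaced at class level, as in the post above.
CHILD_SOURCE = r'''
class StubAgent:
    def get_human_input(self, prompt):
        return input(prompt)

def custom_get_human_input(self, prompt):
    print("__STATUS_WAIT_FOR_HUMAN_INPUT__", prompt, flush=True)
    reply = input(prompt)
    print("__STATUS_RECEIVED_HUMAN_INPUT__", prompt, flush=True)
    return reply

StubAgent.get_human_input = custom_get_human_input

agent = StubAgent()
while True:
    reply = agent.get_human_input("Your reply: ")
    if reply == "exit":
        break
    print("assistant: you said", reply, flush=True)
'''


def run_flow(command, replies):
    """Run `command`, answer each wait marker with the next reply, and
    return the remaining output lines (markers filtered out)."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,
    )
    pending = iter(replies)
    output = []
    for line in proc.stdout:
        line = line.rstrip("\n")
        if WAIT_MARKER in line:
            # A real frontend would ask the user here; this sketch feeds
            # scripted replies and falls back to 'exit' when they run out.
            proc.stdin.write(next(pending, "exit") + "\n")
            proc.stdin.flush()
        elif RECEIVED_MARKER in line:
            continue  # status line only, nothing to show
        else:
            output.append(line)
    proc.stdin.close()
    proc.wait()
    return output


lines = run_flow([sys.executable, "-c", CHILD_SOURCE], ["hello", "exit"])
print(lines)
```

In a web frontend, the branch that writes to `proc.stdin` would instead wait for the user's message from the UI (for example over a websocket) before forwarding it to the child process.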