microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/

It does not work with frontend UIs: forced terminal responses, terrible experience #225

Open plusnarrative-phillip opened 11 months ago

plusnarrative-phillip commented 11 months ago

Tech Stack:

- Backend: Python (Flask, Socket.IO, AutoGen, OpenAI)
- Frontend: React

Working Parts:

- Flask and Socket.IO initialization.
- Frontend sends messages to the backend via /api/chat.
- User's message is processed by the user_proxy agent.
- Agents generate responses.

Issues:

- Feedback loop: post-response, the system asks for feedback (likely from AutoGen's user_proxy.initiate_chat).
- Response delivery: the backend sends responses via Socket.IO, but the frontend doesn't display them.

Questions:

- How does user_proxy.initiate_chat work?
- Can we handle or bypass the feedback loop?
- How should the frontend handle/display responses from Socket.IO?
- How can we finalize or confirm agent responses to prevent further prompts?

Brief Flow:

- Flask, SocketIO, and OpenAI are initialized.
- Agents (strategist, innovator, critic, user_proxy) are instantiated.
- On /api/chat, the user message is sent to user_proxy, agent responses are collected, and they are sent to the frontend via Socket.IO.

plusnarrative-phillip commented 11 months ago

```python
from flask import Flask, request, jsonify
from flask_socketio import SocketIO, emit
import logging
from autogen import AssistantAgent, UserProxyAgent, config_list_from_json
import openai
import autogen
from flask_cors import CORS
import json
from datetime import datetime

app = Flask(__name__)
CORS(app)

logging.basicConfig(level=logging.INFO)
socketio = SocketIO(app, cors_allowed_origins="*")

openai.api_key = ""
config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")
llm_config = {"config_list": config_list, "seed": 42}

strategist = autogen.AssistantAgent(
    name="strategist",
    system_message="""Strategist. With vast expertise in .""",
    llm_config=llm_config
)

innovator = autogen.AssistantAgent(
    name="innovator",
    system_message="""Innovator. """,
    llm_config=llm_config
)

critic = autogen.AssistantAgent(
    name="critic",
    system_message="""Critic. You.""",
    llm_config=llm_config
)

user_proxy = UserProxyAgent("user_proxy")
groupchat = autogen.GroupChat(agents=[user_proxy, strategist, innovator, critic], messages=[], max_round=20)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
connected_clients = 0

@socketio.on('connect')
def handle_connect():
    global connected_clients
    connected_clients += 1
    logging.info(f'Client connected. Total connected clients: {connected_clients}')

    # Emitting initial connection message with timestamp
    logging.info(f"[{datetime.now()}] Sending initial connection message to frontend")
    emit('status', {'timestamp': str(datetime.now()), 'message': 'Connected to the server!'})

    # Emitting test message with timestamp
    logging.info(f"[{datetime.now()}] Sending test message to frontend")
    emit('chat', {'timestamp': str(datetime.now()), 'message': 'This is a test message from the backend.'}, broadcast=True)

@socketio.on('disconnect')
def handle_disconnect():
    global connected_clients
    connected_clients -= 1
    logging.info(f"[{datetime.now()}] Client disconnected. Total connected clients: {connected_clients}")

@app.route('/api/chat', methods=['GET', 'POST'])
def chat():
    try:
        # Log when a request is received
        logging.info(f"[{datetime.now()}] Received a request at /api/chat.")

        user_message = request.json.get('message', '')
        # Log the user's message
        logging.info(f"[{datetime.now()}] User message: {user_message}")

        # Initialize a chat session with the user's message
        user_proxy.initiate_chat(manager, message=user_message)

        # Loop to keep processing feedback and sending all agent responses to the frontend
        while True:
            last_message = manager.groupchat.messages[-1]

            if last_message.sender != user_proxy:
                response_data = {
                    'sender': last_message.sender.name,
                    'text': last_message.text
                }

                # Log the OpenAI response
                logging.info(f"[{datetime.now()}] OpenAI response: {response_data}")

                # Emit the response data via WebSocket to frontend
                logging.info(f"[{datetime.now()}] Sending OpenAI response to frontend")
                socketio.emit('chat', {
                    'timestamp': str(datetime.now()),
                    'sender': response_data['sender'],
                    'text': response_data['text']
                }, room=request.sid)

            # If we hit a feedback request, auto-reply and continue the loop to get the next message.
            # Note: autogen.FeedbackRequest and reply_to_feedback_request do not appear to exist
            # in AutoGen's API; this speculative code is part of what this issue reports.
            if isinstance(last_message, autogen.FeedbackRequest):
                logging.info(f"[{datetime.now()}] Auto-replying to feedback request.")
                manager.reply_to_feedback_request(auto_reply=True)
            else:
                break

        # Log when message processing is completed
        logging.info(f"[{datetime.now()}] Message processing completed.")
        return jsonify(status="Message processed."), 200

    except Exception as e:
        # Log any errors that occur during message processing
        logging.error(f"Error processing the chat request: {e}")
        return jsonify(error=str(e)), 500

if __name__ == '__main__':
    socketio.run(app, debug=True)
```

sonichi commented 11 months ago

@victordibia any suggestion?

victordibia commented 11 months ago

Hi @plusnarrative-phillip ,

If you are exploring a socket-based UI where the agents respond to requests and the response is sent to the UI over a socket, it makes sense that the agents should be able to conclude operations without asking for human feedback on each turn.

So, how do you control the "asking for feedback" behavior? Typically, this is done by setting the human_input_mode argument.

human_input_mode (str): whether to ask for human inputs every time a message is received.
                Possible values are "ALWAYS", "TERMINATE", "NEVER".
                (1) When "ALWAYS", the agent prompts for human input every time a message is received.
                    Under this mode, the conversation stops when the human input is "exit",
                    or when is_termination_msg is True and there is no human input.
                (2) When "TERMINATE", the agent only prompts for human input only when a termination message is received or
                    the number of auto reply reaches the max_consecutive_auto_reply.
                (3) When "NEVER", the agent will never prompt for human input. Under this mode, the conversation stops
                    when the number of auto reply reaches the max_consecutive_auto_reply or when is_termination_msg is True.

For the UserProxyAgent, this argument is set by default to ALWAYS, which may be why you keep seeing the constant request for feedback.

I'd suggest setting it to "NEVER":

user_proxy = UserProxyAgent("user_proxy", human_input_mode="NEVER")
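
In a socket-driven backend like yours, a minimal sketch of that configuration might look like the following (the max_consecutive_auto_reply value is illustrative, not required):

```python
from autogen import UserProxyAgent

# Sketch: a user proxy that never blocks waiting for console input.
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",        # never prompt on the terminal
    max_consecutive_auto_reply=10,   # illustrative cap so the chat can still end
)
```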

Hope this is helpful in some way

bamboriz commented 11 months ago

@victordibia While this makes sense, I'm still not sure how to send and receive messages from the group chat to the frontend. I'm particularly talking about the case where you want a user to also participate and hence need human_input_mode set to ALWAYS.

victordibia commented 11 months ago

Thanks for the clarification. Your use case is clear and makes sense.

@sonichi … Is there a good pattern to hook a function into the human response callback? By default, I know it prompts for command-line input … my intuition is that there might be a way to supply a user-defined function that can then be hooked into some UI interface. I am not at my desk at the moment but will dig in later.
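
One possible pattern is sketched below, under the assumption that your AutoGen version exposes ConversableAgent.get_human_input (the method UserProxyAgent uses to prompt on the command line). WebUserProxyAgent and the input_request/user_input event names are hypothetical; socketio refers to the Flask-SocketIO instance from the code above:

```python
import queue

from autogen import UserProxyAgent

class WebUserProxyAgent(UserProxyAgent):
    """Hypothetical subclass: waits on a queue fed by the socket handler
    instead of prompting for command-line input."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.input_queue = queue.Queue()

    def get_human_input(self, prompt: str) -> str:
        # Ask the browser for input, then block until the frontend replies.
        socketio.emit('input_request', {'prompt': prompt})
        return self.input_queue.get()

# The frontend answers via a socket event that feeds the queue, e.g.:
# @socketio.on('user_input')
# def handle_user_input(data):
#     web_user_proxy.input_queue.put(data.get('message', ''))
```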


bamboriz commented 11 months ago

That sounds like something; I'll look into it. But if anyone finds something interesting, I'm all for it.

sonichi commented 11 months ago

replied in #241

plusnarrative-phillip commented 11 months ago

It honestly makes sense to always use NEVER if you're going to hit the transformer token limit because of lengthy, sophisticated custom character context. Then you can just use a final agent called the summarizer, which summarizes everyone's messages in full context, and append that to follow-up messages; the end user gets a great continuous experience. @victordibia, much appreciated, I will give it a go. I saw your code, but I am not a big types or TypeScript guy (I know, blasphemy lol). I am not sure what your strategy is: do you normally also use .env files to connect to OpenAI? I assume you don't and have a different strategy that's common practice. I honestly could not get it working. It was clean, well-refactored code, but probably refactored too well, to the point where it was hard to find everything; what feels like a simple chat application was fleshed out like a SaaS solution rather than a basic MVP prototype.
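
A rough sketch of that summarizer idea, assuming the agent name and prompt are placeholders, llm_config comes from the earlier code, and groupchat.messages holds dicts with name/content keys as in AutoGen's GroupChat:

```python
# Hypothetical summarizer agent that compresses the transcript so follow-up
# messages stay under the model's token limit.
summarizer = autogen.AssistantAgent(
    name="summarizer",
    system_message="Summarizer. Condense the conversation so far into a short brief "
                   "that preserves all decisions, constraints, and open questions.",
    llm_config=llm_config,
)

def summarize_history(groupchat):
    # Flatten the group chat into one message and ask the summarizer for a brief.
    transcript = "\n".join(
        f"{m.get('name', m['role'])}: {m['content']}" for m in groupchat.messages
    )
    return summarizer.generate_reply(messages=[{"role": "user", "content": transcript}])
```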

plusnarrative-phillip commented 11 months ago

I can officially say this is just bleeding me dry; it's mainly hype. It would be faster to build a system from scratch than to try to get a tool like this to work reliably. Great idea, and I hope you continue to work on it, but for now this is an unreliable toy.

```python
from flask import Flask, request, jsonify
from flask_socketio import SocketIO, emit
import logging
from autogen import AssistantAgent, UserProxyAgent, config_list_from_json
import openai
import autogen
from flask_cors import CORS
import json
from datetime import datetime

app = Flask(__name__)
CORS(app)

logging.basicConfig(level=logging.INFO)
socketio = SocketIO(app, cors_allowed_origins="*")

openai.api_key = ""
config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")
llm_config = {"config_list": config_list, "seed": 42}

strategist = autogen.AssistantAgent(
    name="strategist",
    system_message="""""",
    llm_config=llm_config
)

innovator = autogen.AssistantAgent(
    name="innovator",
    system_message="""""",
    llm_config=llm_config
)

critic = autogen.AssistantAgent(
    name="critic",
    system_message="""Critic. .""",
    llm_config=llm_config
)

user_proxy = UserProxyAgent("user_proxy", human_input_mode="NEVER")
groupchat = autogen.GroupChat(agents=[user_proxy, strategist, innovator, critic], messages=[], max_round=20)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
connected_clients = 0

@socketio.on('connect')
def handle_connect():
    global connected_clients
    connected_clients += 1
    logging.info(f'Client connected. Total connected clients: {connected_clients}')

    # Emitting initial connection message with timestamp
    logging.info(f"[{datetime.now()}] Sending initial connection message to frontend")
    emit('status', {'timestamp': str(datetime.now()), 'message': 'Connected to the server!'})

    # Emitting test message with timestamp
    logging.info(f"[{datetime.now()}] Sending test message to frontend")
    emit('chat', {'timestamp': str(datetime.now()), 'message': 'This is a test message from the backend.'}, broadcast=True)

@socketio.on('disconnect')
def handle_disconnect():
    global connected_clients
    connected_clients -= 1
    logging.info(f"[{datetime.now()}] Client disconnected. Total connected clients: {connected_clients}")

@app.route('/api/chat', methods=['GET', 'POST'])
def chat():
    try:
        user_message = request.json.get('message', '').strip()

        if not user_message:  # Check if the message is blank
            return jsonify(status="Empty message received, not processed."), 400

        logging.info(f"[{datetime.now()}] Received message from frontend: {user_message}")

        user_proxy.initiate_chat(manager, message=user_message)
        # Note: UserProxyAgent has no expecting_input attribute; this line is the
        # source of the AttributeError reported later in this thread.
        if user_proxy.expecting_input:
            user_proxy.expecting_input = False
            last_message = manager.groupchat.messages[-1]

            # Filtering out blank responses before processing
            response_text = last_message.text.strip()
            if not response_text:
                logging.warning(f"[{datetime.now()}] OpenAI returned an empty response. Ignoring.")
                return

            if last_message.sender != user_proxy:
                response_data = {
                    'sender': last_message.sender.name,
                    'text': last_message.text
                }

            # Log OpenAI's response
            logging.info(f"[{datetime.now()}] OpenAI's response: {response_data['text']}")

            # Emit the response data via WebSocket to frontend
            logging.info(f"[{datetime.now()}] Sending OpenAI response to frontend")
            socketio.emit('chat', {
                'timestamp': str(datetime.now()),
                'sender': response_data['sender'],
                'text': response_data['text']
            }, room=request.sid)

        logging.info(f"[{datetime.now()}] Message processing completed.")
        return jsonify(status="Message processed."), 200

    except Exception as e:
        logging.error(f"Error processing the chat request: {e}")
        return jsonify(error=str(e)), 500

@socketio.on('chat_event')
def handle_chat_event(data):
    message = data.get('message', '').strip()

    if not message:  # Check if the message is blank
        emit('chat', {
            'timestamp': str(datetime.now()),
            'sender': 'server',
            'text': "Received empty message. Ignored."
        })
        return

    # Log the received message from the frontend
    logging.info(f"[{datetime.now()}] Received message from frontend via WebSocket: {message}")

    try:
        user_proxy.initiate_chat(manager, message=message)
        last_message = manager.groupchat.messages[-1]

        if last_message.sender != user_proxy:
            response_data = {
                'sender': last_message.sender.name,
                'text': last_message.text
            }

            # Log OpenAI's response
            logging.info(f"[{datetime.now()}] OpenAI's response: {response_data['text']}")

            # Emit the response data via WebSocket to frontend
            logging.info(f"[{datetime.now()}] Sending OpenAI response to frontend via WebSocket")
            emit('chat', {
                'timestamp': str(datetime.now()),
                'sender': response_data['sender'],
                'text': response_data['text']
            })

        logging.info(f"[{datetime.now()}] Message processing completed via WebSocket.")

    except Exception as e:
        logging.error(f"Error processing the chat request via WebSocket: {e}")
        emit('chat', {
            'timestamp': str(datetime.now()),
            'sender': 'server',
            'text': f"Error: {e}"
        })

if __name__ == '__main__':
    socketio.run(app, debug=True)
```

It has a habit of not sending messages to the frontend. It also has a habit of looping, even when you tell it not to, which is terrible. It frequently makes OpenAI API requests to the point of causing issues, even with blank messages. Additionally, it has a habit of breaking the socket connection when the API exceeds its limits. This is a toy that has received more hype than it deserves, at least for now. I love the idea of this tool and hope you continue to invest in it, but it's way faster to build a system that works than to use this tool. I mean, you could build a similar tool using GPT in a day or two.

sonichi commented 11 months ago

Thanks for the feedback. To my knowledge, there have been product launches built with AutoGen. I appreciate the high expectations, and the community will do its best to meet them. I've seen many successful examples on Discord and would like to understand your failure case in more depth. If you like, I can also ask other developers to take a look at your scenario and see if they can help.

victordibia commented 11 months ago

Hi @plusnarrative-phillip ,

Thanks for sharing the feedback above. Some notes.

Thanks again for the feedback

sonichi commented 11 months ago

For loops, we can have a separate discussion. The agents need to have clear instructions about when to terminate and the termination message needs to match the is_termination_msg function. One known issue is that GPT-3.5-turbo often ignores the termination instruction if it's mixed with many other instructions.
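
For example, a minimal sketch of wiring up termination, assuming the agents' system messages instruct them to end their final message with the literal word TERMINATE:

```python
from autogen import UserProxyAgent

# Stop when a message ends with "TERMINATE" and never wait for console input.
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    is_termination_msg=lambda msg: (msg.get("content") or "").rstrip().endswith("TERMINATE"),
)
```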

plusnarrative-phillip commented 11 months ago

All gravy, thanks for all the feedback, much appreciated. I managed to build my own version in the meantime, although I still have a lot of hope for this project (AutoGen).

You're welcome to close this ticket.

BTW, you might want to consider message types as a nice touch, since they offer various approaches to improved message management and potentially more efficient LLM calls. If a user has a predefined sequence, that can help reduce the load; if a user wants the LLM to determine who speaks next, as AutoGen does today, that would be its own thing; if a user wants a no-loop approach (I call this "standard"), you can use a summarizer LLM at the end and append its output for context; and if you want loop functionality, there can be two states, human-interaction loops or no-human-interaction loops, where it keeps going until the human changes the state to interact, observing the conversation and only chipping in when needed. I made a system like this; although rudimentary, it seems to be a great way forward.

Message-type modes:

- Sequence: a predefined speaker order to reduce load.
- LLM-routed: the LLM decides who speaks next (AutoGen's current behavior).
- Standard (no loop): a single pass, with a summarizer LLM appended for context.
- Loop: continuous rounds, toggled between human-interaction and no-human-interaction states.

The tech is so exciting; I hope some of these ideas I have developed help you guys :)
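
For what it's worth, the "sequence" mode above roughly corresponds to a fixed speaker order. A minimal sketch, assuming a pyautogen version whose GroupChat accepts the speaker_selection_method parameter (agent names reused from the earlier code):

```python
# Fixed round-robin order instead of letting the LLM pick the next speaker.
groupchat = autogen.GroupChat(
    agents=[user_proxy, strategist, innovator, critic],
    messages=[],
    max_round=20,
    speaker_selection_method="round_robin",
)
```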

ChristianWeyer commented 11 months ago

Any link to your project, @plusnarrative-phillip?

sonichi commented 11 months ago

@plusnarrative-phillip thanks for the discussion. A few tips regarding the types of communications you mentioned:

Hope this helps and look forward to seeing what you build 🏗️

khawajaJunaid commented 9 months ago

@plusnarrative-phillip did you have your own implementation of the user proxy? Because when I run the code you shared above, it errors out: ERROR:root:Error processing the chat request: 'UserProxyAgent' object has no attribute 'expecting_input'

dsaha21 commented 6 months ago

Hi, I am also facing a similar problem with sending each bot response I receive to the frontend. My conversation pattern is sequential; the reason is that I have to ask the AssistantAgent a set of about 20 questions about a sign-up process. For more clarity, please see this discussion: https://github.com/microsoft/autogen/discussions/2047

Therefore, is there a way to send data to the frontend and get data back to the backend so it can be used for a chatbot? Please let me know. Thank you.