microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/

[Feature Request]: EmbodiedAgent for allowing AutoGen to control a robot or multiple robots #1239

Closed: Tylersuard closed this issue 6 months ago

Tylersuard commented 7 months ago

Is your feature request related to a problem? Please describe.

Currently AutoGen works only on text, but it has the potential to call functions that could move a robot, for example raise_arm(30) or move.forward(10).

Describe the solution you'd like

A new file in the agent_chat folder called embodied_agent, containing an EmbodiedAgent class with a few dummy methods for controlling a robot in simulation or in the real world.
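
A rough sketch of what such a class could look like; EmbodiedAgent, raise_arm, and move_forward are illustrative names, not existing AutoGen APIs:

from autogen import ConversableAgent


class EmbodiedAgent(ConversableAgent):
    """Agent with placeholder methods for driving a real or simulated robot."""

    def raise_arm(self, degrees: float) -> None:
        # Stub: forward the command to the robot's firmware or a simulator.
        print(f"[stub] raising arm by {degrees} degrees")

    def move_forward(self, seconds: float) -> None:
        # Stub: drive the robot forward for the given duration.
        print(f"[stub] moving forward for {seconds} seconds")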

Additional context

I got this idea at CES while seeing some robot dogs. It would be possible to have AutoGen accept speech from a user via a microphone on the robot, and if given functions connected to the robot's firmware, AutoGen could control that robot's movements, especially if it can incorporate images and GPT-V. It would be cool if we had a robot to test this on.
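
For the speech-input piece, a minimal sketch of sending a recorded clip to Whisper through the OpenAI API; recording the audio from the robot's microphone is assumed to have happened already, and the file path is hypothetical:

from openai import OpenAI

client = OpenAI()


def transcribe_command(wav_path: str) -> str:
    """Send a recorded audio clip to Whisper and return the transcript."""
    with open(wav_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )
    return transcript.text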

sonichi commented 7 months ago

@AaronWard could you share your prototype and thoughts?

ekzhu commented 7 months ago

This is a cool idea. I think we can do it without creating a new class, i.e., by registering functions with the agent.

AaronWard commented 7 months ago

@Tylersuard The only issue with this is that it assumes all robots use the same framework. For it to work, it would need to be highly generalized. From my understanding, ROS (Robot Operating System) is the most common framework for controlling robots, so that could be a good starting point.

I agree we probably don't need to create a new class, @ekzhu, but we could make a tool that encapsulates sending requests concurrently to the vision endpoint / completion endpoint (assuming AutoGen can't already handle concurrency across different endpoints).
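
A rough illustration of that kind of concurrent fan-out, assuming the OpenAI Python client (openai>=1.0) and asyncio.gather; the model names, prompts, and function are placeholders rather than anything AutoGen provides:

import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()


async def describe_scene(image_url: str, question: str) -> tuple[str, str]:
    # Build both requests without awaiting them yet.
    vision_task = client.chat.completions.create(
        model="gpt-4o",  # placeholder vision-capable model
        messages=[{"role": "user", "content": [
            {"type": "text", "text": "Describe what the robot's camera sees."},
            {"type": "image_url", "image_url": {"url": image_url}},
        ]}],
    )
    completion_task = client.chat.completions.create(
        model="gpt-4o",  # placeholder completion model
        messages=[{"role": "user", "content": question}],
    )
    # Both HTTP calls run concurrently; wait for the pair together.
    vision, completion = await asyncio.gather(vision_task, completion_task)
    return vision.choices[0].message.content, completion.choices[0].message.content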

From my example:

My robot uses action files in a .d6a format, so through some tricky prompting I was able to give it instructions in natural language; it would formulate a plan from a fixed list of actions, which was then passed to the UserProxyAgent, which would sequentially send the commands to the Raspberry Pi on the robot. I haven't had time to work on this further, but I would be interested in this feature request.
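
A very loose sketch of the flow described above, purely for illustration: the allowed action names, the HTTP endpoint on the Raspberry Pi, and the transport are all assumptions, and the actual .d6a action files are not modeled here:

import requests

# Hypothetical fixed action list the LLM is allowed to plan with.
ALLOWED_ACTIONS = ["stand", "sit", "wave", "walk_forward", "turn_left"]
PI_URL = "http://raspberrypi.local:8000/run_action"  # hypothetical endpoint


def run_action(action: str) -> str:
    """Send one pre-defined action to the robot and return its status."""
    if action not in ALLOWED_ACTIONS:
        return f"unknown action: {action}"
    response = requests.post(PI_URL, json={"action": action}, timeout=10)
    return response.text


def execute_plan(plan: list[str]) -> None:
    # The LLM formulates `plan` as an ordered list of allowed action names;
    # the user proxy side then executes them one at a time.
    for action in plan:
        print(action, "->", run_action(action))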

Tylersuard commented 7 months ago

@AaronWard I love your sarcastic chat agent idea; I made my own too: www.insult-bot.com

I'm glad you tried using AutoGen with a robot, and it looks like it worked out well. @ekzhu My point with adding an EmbodiedAgent class is to make users aware that AutoGen can be applied to robots, and maybe to suggest one way to use it. For example, a loop with observe_environment (take a picture), a prompt (like "You are an embodied agent. To move your left arm, return 'LA'. To move forward for 10 seconds, return 'F10'."), and a way for it to respond to commands. Maybe also an interface for sending recorded audio to Whisper, etc. So: making users aware that AutoGen can be applied to robots, and making it easier for them to do so. What do you think?
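
A bare sketch of that observe / prompt / act loop; capture_image and ask_llm stand in for the robot's camera and the model call, and the command codes follow the example prompt above (none of this is an existing AutoGen API):

import re

SYSTEM_PROMPT = (
    "You are an embodied agent. To move your left arm, return 'LA'. "
    "To move forward for N seconds, return 'F<N>', e.g. 'F10'."
)


def act(command: str) -> None:
    # Map the short command codes back onto (stubbed) robot motions.
    if command == "LA":
        print("[stub] moving left arm")
    elif match := re.fullmatch(r"F(\d+)", command):
        print(f"[stub] moving forward for {match.group(1)} seconds")
    else:
        print(f"[stub] ignoring unrecognized command: {command}")


def control_loop(ask_llm, capture_image, steps: int = 5) -> None:
    """Run a few observe -> prompt -> act iterations."""
    for _ in range(steps):
        image = capture_image()                   # take a picture of the environment
        command = ask_llm(SYSTEM_PROMPT, image)   # ask the model for a command
        act(command)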

ekzhu commented 7 months ago

Perhaps it is possible to first create a set of Python functions that define the tools the robot can use, then register these functions with the agents so the agent can decide which tool to use and take feedback from the tool. For example:


from typing import Annotated, Dict

from autogen import ConversableAgent, UserProxyAgent

# The llm_config below is only a placeholder; fill in your own model and keys.
robot = ConversableAgent(
    "robot",
    system_message="You are a robot, use the tools provided to explore the environment...",
    llm_config={"config_list": [{"model": "gpt-4"}]},
)
environment = UserProxyAgent("environment")

# The robot agent can suggest these tools; the environment agent executes them.
@robot.register_for_llm(description="A tool used to observe the environment")
@environment.register_for_execution()
def observe_environment() -> Annotated[Dict, "some explanation of the result"]:
    pass

@robot.register_for_llm(description="A tool used for moving in the environment")
@environment.register_for_execution()
def move(direction: Annotated[str, "direction to move"]) -> Annotated[Dict, "some description of output"]:
    pass

environment.initiate_chat(robot, message="Some initial prompt to the robot")

AaronWard commented 7 months ago

@ekzhu @sonichi @Tylersuard I was recently made aware of this library built on top of AutoGen: https://github.com/babycommando/machinascript-for-robots

ekzhu commented 7 months ago

Thanks for the heads up. Do you know the people behind the project? Perhaps ask them if we can add it to our gallery.

sonichi commented 6 months ago

They can be found at https://discord.com/channels/1153072414184452236/1196311386633015347

AaronWard commented 6 months ago

Upon further inspection, I realized the project isn't really a library, even though it's marketed that way (I called this out on the Discord). It's just a template that uses AutoGen, and I haven't seen any running examples of it actually working yet. It all looks very conceptual at the moment, but it could be something to add to the gallery in the future.

sonichi commented 6 months ago

> Upon further inspection, I realized the project isn't really a library, even though it's marketed that way (I called this out on the Discord). It's just a template that uses AutoGen, and I haven't seen any running examples of it actually working yet. It all looks very conceptual at the moment, but it could be something to add to the gallery in the future.

What about your repo https://github.com/AaronWard/generative-ai-workbook/tree/main/personal_projects/19.autogen-robot ? Worth adding to https://microsoft.github.io/autogen/docs/Gallery?

AaronWard commented 6 months ago

@sonichi Sure, I can submit a PR to list the example. Thanks!