daveshap / OpenAI_Agent_Swarm

HAAS = Hierarchical Autonomous Agent Swarm - "Resistance is futile!"
MIT License
2.95k stars, 381 forks

Add OpenApiFunctionHelper for Enhanced API Integration and Spec Processing #164

Closed: ph47s74x closed 6 months ago

ph47s74x commented 8 months ago

Summary

This pull request introduces the OpenApiFunctionHelper class, a tool that transforms OpenAPI specifications into a format suitable for OpenAI assistant integrations. The utility is meant to let our system interact with a wide range of APIs more efficiently and robustly.

Key Features

Impact

Integrating OpenApiFunctionHelper into our codebase enables us to significantly enhance the capabilities of our agents. It paves the way for a more flexible and robust API interaction framework, which can adapt to a variety of use cases and improve the overall effectiveness of our system.

Usage

An example of how OpenApiFunctionHelper can be utilized is included in the code documentation. This demonstrates the process of loading, processing, and converting an OpenAPI spec into a format compatible with OpenAI assistants.
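
The example from the code documentation is not reproduced in this thread. As a rough illustration of the kind of conversion described, here is a minimal sketch that maps each OpenAPI operation to one function-tool definition in the shape the OpenAI tools parameter expects. The name `openapi_to_tools` and the sample spec are hypothetical and are not the PR's actual API. The sketch assumes every operation declares an `operationId`, and it ignores request bodies and `$ref` resolution.

```python
from typing import Any

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def openapi_to_tools(spec: dict[str, Any]) -> list[dict[str, Any]]:
    """Convert each OpenAPI operation into a function-tool definition.

    Hypothetical helper for illustration; not the PR's OpenApiFunctionHelper.
    """
    tools = []
    for path_item in spec.get("paths", {}).values():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            properties: dict[str, Any] = {}
            required: list[str] = []
            for param in operation.get("parameters", []):
                # Reuse the parameter's JSON schema, defaulting to a string.
                properties[param["name"]] = {
                    **param.get("schema", {"type": "string"}),
                    "description": param.get("description", ""),
                }
                if param.get("required", False):
                    required.append(param["name"])
            tools.append({
                "type": "function",
                "function": {
                    # Assumption: every operation declares an operationId.
                    "name": operation["operationId"],
                    "description": operation.get("summary", ""),
                    "parameters": {
                        "type": "object",
                        "properties": properties,
                        "required": required,
                    },
                },
            })
    return tools


# Tiny hand-written spec fragment, used only to exercise the sketch.
sample_spec = {
    "paths": {
        "/models/{model}": {
            "get": {
                "operationId": "retrieveModel",
                "summary": "Retrieve a model by ID.",
                "parameters": [
                    {"name": "model", "in": "path", "required": True,
                     "schema": {"type": "string"}}
                ],
            }
        }
    }
}

tools = openapi_to_tools(sample_spec)
```

The resulting list could then be passed as the `tools` argument when creating an assistant. A real spec would also need request bodies and schema references handled.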

Next Steps

Post-merge, the next steps will involve integrating this functionality with our existing agents and starting the process of building out our API library. This will be a collaborative effort to ensure smooth integration and to maximize the utility of the new features.

I look forward to your feedback and am available for any further discussions or changes required for this integration.

Gottheil commented 8 months ago

Hey,

Forgive me if I am missing the full scope of the feature; I don't think I understood it fully.

I'll try to have a go:

  1. Does this make it easier to create OpenAI API requests? If so, what is wrong with the basic implementation of the API:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Write and run code to answer math questions.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview",
)

  2. Does this help with integrating third-party APIs? If so, I would argue that the final low-level capabilities of the swarm are not yet sufficiently definable. First we will have to prove the inner structure. I fear that an early commitment to a standard for handling the outside world might backfire if the current architecture needs to be adapted for some reason.

Again, I am really sorry if I simply misunderstood the feature. All I mean to do is understand and possibly spark conversation.

Cheers!

djgilcrease commented 8 months ago

Based on reading this, if you create an assistant and pass it the functions parsed from open_ai_oas.yaml, then that assistant would have everything it needs to create additional assistants, upload/download files for the new assistant, and fully utilize the assistant. So this appears to be a step toward agents having the ability to create other agents.

ph47s74x commented 7 months ago

Video demonstration: https://youtu.be/ZmLXaNLkhO4

In my latest video, I showcase an extension to my code that transforms any OpenAPI schema into a dynamic tool for an AI assistant. This approach allows for semi-autonomous interactions with APIs, demonstrated using the OpenAI OpenAPI Schema.

The process begins with natural language instructions, which start a multi-step interaction sequence. Here are the specific instructions issued:

  1. Retrieve the list of available models and select "gpt-4-1106-preview" as the model to use.
  2. Create three OpenAI assistants using the latest model:
    • An "Assistant Creator" using the code interpreter tool.
    • An "Image Creator" equipped with functions for interacting with DALL-E.
    • A "Completions Assistant" for text completion tasks.
  3. Retrieve all available assistants.
  4. Generate a unique image for each assistant.
  5. Assign the created images to each respective assistant.

Following these instructions, the AI autonomously makes the necessary API calls, with decisions rooted in the function calls derived from the OpenAPI schema. The AI successfully completes the tasks and responds with:

  1. Successfully retrieved available models and selected "gpt-4-1106-preview".
  2. Created three specialized OpenAI assistants:
    • "Assistant Creator" with a code interpreter tool.
    • "Image Creator" for DALL-E interactions.
    • "Completion Assistant" for text completions.
  3. Retrieved a comprehensive list of available assistants.
  4. Generated distinct images for each assistant, symbolizing their unique functionalities.
  5. Set the newly created images for each assistant accordingly.

This method introduces a highly flexible API interface. By combining the OpenAPI schema with natural language, we can instruct the AI without rewriting code: adapting to an API change only requires updating the schema. It is a more fluid and adaptable way to interact with APIs, reducing development overhead and enhancing the AI's ability to execute tasks autonomously.
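
To make the "only schema updates are required" idea concrete, here is a minimal sketch of how a function call issued by the model (an operation name plus JSON arguments) could be resolved back to an HTTP request using nothing but the spec. It is not the code from the video. `build_request`, the sample spec, and the `api.example.com` base URL are hypothetical. The sketch only handles path and query parameters, treats any remaining arguments as the JSON body, and returns a description of the request instead of sending it.

```python
import json
from typing import Any
from urllib.parse import quote, urlencode

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def build_request(spec: dict[str, Any], name: str, arguments_json: str) -> dict[str, Any]:
    """Resolve a function call to an HTTP request description (hypothetical helper)."""
    args = json.loads(arguments_json)
    base_url = spec["servers"][0]["url"]
    for path, path_item in spec["paths"].items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS or operation.get("operationId") != name:
                continue
            query: dict[str, Any] = {}
            for param in operation.get("parameters", []):
                pname = param["name"]
                if pname not in args:
                    continue
                if param.get("in") == "path":
                    # Substitute the URL-encoded value into the path template.
                    value = quote(str(args.pop(pname)), safe="")
                    path = path.replace("{" + pname + "}", value)
                elif param.get("in") == "query":
                    query[pname] = args.pop(pname)
            url = base_url + path + ("?" + urlencode(query) if query else "")
            # Whatever arguments remain are treated as the JSON body.
            return {"method": method.upper(), "url": url, "json": args or None}
    raise KeyError(f"no operation named {name!r} in spec")


# Hand-written spec fragment with a placeholder server URL.
sample_spec = {
    "servers": [{"url": "https://api.example.com/v1"}],
    "paths": {
        "/assistants": {
            "get": {
                "operationId": "listAssistants",
                "parameters": [
                    {"name": "limit", "in": "query", "schema": {"type": "integer"}}
                ],
            },
            "post": {"operationId": "createAssistant"},
        }
    },
}

list_request = build_request(sample_spec, "listAssistants", '{"limit": 5}')
create_request = build_request(
    sample_spec,
    "createAssistant",
    '{"name": "Image Creator", "model": "gpt-4-1106-preview"}',
)
```

Because the routing is driven entirely by the spec, supporting a new endpoint in this sketch means adding an entry under `paths`, with no change to `build_request`.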