xlang-ai / OpenAgents

[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild
https://arxiv.org/abs/2310.10634
Apache License 2.0

OpenAgents: An Open Platform for Language Agents in the Wild

Current language agent frameworks aim to facilitate the construction of proof-of-concept language agents, but they neglect non-expert users' access to agents and pay little attention to application-level design. We built OpenAgents, an open platform for using and hosting language agents in the wild of everyday life.

We have implemented three agents in OpenAgents so far, and we host them as a free online demo!

  1. Data Agent for data analysis with Python/SQL and data tools;
  2. Plugins Agent with 200+ daily tools;
  3. Web Agent for autonomous web browsing.

OpenAgents can analyze data, call plugins, and control your browser like ChatGPT Plus, but with OPEN code for:

  1. Easy deployment
  2. Full stack
  3. Chat Web UI
  4. Agent methods

OpenAgents enables general users to interact with agent functionalities through a web UI optimized for swift responses and common failures, while offering developers and researchers a seamless deployment experience on local setups, providing a foundation for crafting innovative language agents and facilitating real-world evaluations. We elucidate both the challenges and promising opportunities, aspiring to set a foundation for future research and development of real-world language agents.

We welcome contributions from everyone. Before you start, please take a moment to read our CONTRIBUTING.md guidelines for issues and PRs. This will help ensure that your contribution process is smooth and consistent with the project’s standards.

🔫 Troubleshooting

Join our Discord for help if you encounter any issues with our online demo or local deployment. Alternatively, create an issue if you have trouble with features or code.

🔥 News

🥑 OpenAgents

We built three real-world agents with a chat-based web UI as a demonstration (check the OpenAgents demos). Here is a brief overview of our OpenAgents platform. You can find more details about concepts and designs in our documentation.

Data Agent

Data Agent is a comprehensive toolkit designed for efficient data operations. It provides capabilities to:

With its proficiency in writing and executing code, Data Agent simplifies a wide range of data-centric tasks. Discover its potential through various use cases.

Plugins Agent

Plugins Agent seamlessly integrates with over 200 third-party plugins, each handpicked to enrich various facets of your daily life. With these plugins at its disposal, the agent empowers you to tackle a wide range of tasks and activities more efficiently.

🔌 Sample Plugins Include:

Combined Plugin Usage

Harness the power of synergy! Plugins Agent supports the concurrent use of multiple plugins. Planning a trip? Seamlessly integrate functionalities from Klook, Currency converter, and WeatherViz.

Auto Plugin Selection

Simplify your choices with our Auto Plugin Selection feature. Let the agent intuitively search and suggest the best plugins tailored to your needs.

Dive into more use cases to see Plugins Agent in action.

Web Agent

Web Agent harnesses the power of a Chrome extension to navigate and explore websites automatically. This agent streamlines the web browsing experience, making it easier to find relevant information, access desired resources, and so on.

Examples of What Web Agent Can Do:

Witness the full potential of Web Agent in these use cases.

💻 Localhost Deployment

We've released the OpenAgents platform code. Feel free to deploy on your own localhost!

Here is a brief system design of OpenAgents:

From Source Code

Please check the following folders and README files to set up and run on localhost:

  1. Backend: the Flask backend that hosts our three agents.
  2. Frontend: the frontend UI and WeBot Chrome extension.

P.S.: We have renamed some arguments in the code for better readability. If you pulled the code before 10/26/2023, note that pulling the latest code will cause your previous local chat history to be lost, because the key names have changed.

Docker

Follow the steps below to deploy the OpenAgents platform with docker-compose.

Note: the Docker deployment is still under development, so some functions may not work as expected and responses may be slower. Please feel free to open an issue if you have any questions. If you want a more robust version, we currently recommend deploying from source code.

  1. If you want to use Kaggle datasets, replace the empty values in the Dockerfile with your own Kaggle credentials.
    ENV KAGGLE_USER="" \
    KAGGLE_KEY="" 
  2. If you are not running locally, set the backend endpoint in frontend/Dockerfile to an IP address at which the backend service is reachable.
    ENV NEXT_PUBLIC_BACKEND_ENDPOINT http://x.x.x.x:8000
  3. Run the docker compose build command in the project root directory.
  4. If you use an unofficial OpenAI-compatible service, such as FastChat, modify OPENAI_API_BASE in docker-compose.yml; otherwise you only need to put your OPENAI_API_KEY in docker-compose.yml.
  5. After completing the above steps, run docker compose up -d to start all services.
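
The interplay between OPENAI_API_KEY and OPENAI_API_BASE in step 4 can be pictured with the minimal Python sketch below. It is only an illustration: `resolve_openai_settings` is a hypothetical helper, not part of OpenAgents, and the sketch assumes that a self-hosted OpenAI-compatible server may not need a real key, which the steps above do not state.

```python
import os
from typing import Mapping, Optional

# The official OpenAI endpoint, used here when OPENAI_API_BASE is not set.
DEFAULT_OPENAI_API_BASE = "https://api.openai.com/v1"


def resolve_openai_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the key and base URL implied by the environment.

    Hypothetical helper for illustration; not part of OpenAgents.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "").strip()
    api_base = env.get("OPENAI_API_BASE", "").strip()
    if not api_base:
        # Official service: only the key has to be provided.
        if not api_key:
            raise ValueError("set OPENAI_API_KEY in docker-compose.yml")
        api_base = DEFAULT_OPENAI_API_BASE
    # Assumption: with a custom base (e.g. FastChat) the key may be empty.
    return {"api_key": api_key, "api_base": api_base}
```

Calling `resolve_openai_settings({"OPENAI_API_KEY": "sk-test"})` falls back to the official endpoint, while passing an `OPENAI_API_BASE` value keeps that value unchanged.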

Notice:

  1. If you want to use a GPU, you need to install the NVIDIA Container Toolkit and uncomment lines 56-62 of docker-compose.yml.
  2. Using Auto Plugin Selection downloads the weight file from Hugging Face. In some regions the connection may time out; in that case you will need to resolve the network problem on your side.

📜 Tutorial on Extending OpenAgents

Code Structure

Before we dive into how to extend OpenAgents, let's first take a glance at the code structure for better understanding. The code structure of OpenAgents is shown below:

├── backend  # backend code
│   ├── README.md  # backend README for setup
│   ├── api  # RESTful APIs, to be called by the frontend
│   ├── app.py  # main flask app
│   ├── display_streaming.py  # rendering the streaming response
│   ├── kernel_publisher.py  # queue for code execution
│   ├── main.py  # main entry for the backend
│   ├── memory.py  # memory(storage) for the backend
│   ├── schemas.py  # constant definitions
│   ├── setup_script.sh  # one-click setup script for the backend
│   ├── static  # static files, e.g., cache and figs
│   └── utils  # utilities
├── frontend  # frontend code
│   ├── README.md  # frontend README for setup
│   ├── components  # React components
│   ├── hooks  # custom React hooks
│   ├── icons  # icon assets
│   ├── next-env.d.ts  # TypeScript declarations for Next.js environment variables
│   ├── next-i18next.config.js  # configuration settings for internationalization
│   ├── next.config.js  # configuration settings for Next.js
│   ├── package-lock.json  # generated by npm that describes the exact dependency tree
│   ├── package.json  # manifest file that describes the dependencies
│   ├── pages  # Next.js pages
│   ├── postcss.config.js  # configuration settings for PostCSS
│   ├── prettier.config.js  # configuration settings for Prettier
│   ├── public  # static assets
│   ├── styles  # global styles
│   ├── tailwind.config.js  # configuration settings for Tailwind CSS
│   ├── tsconfig.json  # configuration settings for TypeScript
│   ├── types  # type declarations
│   ├── utils  # utilities or helper functions
│   ├── vitest.config.ts  # configuration settings for ViTest
│   └── webot_extension.zip  # Chrome extension for Web Agent
└── real_agents  # language agents
    ├── adapters  # shared components for the three agents to adapt to the backend
    ├── data_agent  # data agent implementation
    ├── plugins_agent  # plugins agent implementation
    └── web_agent  # web agent implementation

As shown, backend/ and frontend/ are self-contained and directly deployable (see here). This does not mean they cannot be modified: you can follow the conventional client-server architecture to extend the backend and frontend as you wish. We design real_agents/ as "one agent, one folder", so that adding a new agent is easy. We name it "real agents" because it includes not only the conceptual language agent but also the code that fills the gaps between the language agent and the backend. For example, adapters/ contains shared adapter components such as stream parsing, data models, memory, and callbacks. We refer interested readers to our paper for concepts and implementation designs. We thank LangChain, whose code we build on to implement real agents.
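
To make the "one agent, one folder" convention concrete, here is a small Python sketch that lists the agent folders under real_agents/. The function name `list_agent_folders` is hypothetical and the platform does not necessarily discover agents this way; the sketch only assumes the directory layout shown in the tree above, where adapters/ holds shared components rather than an agent.

```python
from pathlib import Path
from typing import List

# adapters/ holds shared components, not an agent (see the tree above).
SHARED_FOLDERS = {"adapters"}


def list_agent_folders(real_agents_dir: str) -> List[str]:
    """Return the sorted names of agent folders under real_agents/.

    Hypothetical helper for illustration; not part of OpenAgents.
    """
    root = Path(real_agents_dir)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir()
        and entry.name not in SHARED_FOLDERS
        # Skip hidden folders and caches such as __pycache__.
        and not entry.name.startswith((".", "_"))
    )
```

Under this convention, adding a fourth agent amounts to adding one more sibling folder next to data_agent/, plugins_agent/, and web_agent/.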

Extend A New Agent

If you want to build a new agent beyond the three agents we provide, you can follow the steps below:

Note: if your agent produces new data types, i.e., anything beyond text, image, table, and JSON, you may need to implement their parsing logic in backend/display_streaming.py and add new data models.
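
As a rough picture of what supporting a new data type involves, the sketch below dispatches payloads to per-type parsers and adds one for a made-up "audio" type. Everything in it (`PARSERS`, `register_parser`, `parse_payload`, and the output shape) is hypothetical; the actual logic in backend/display_streaming.py may be organized quite differently.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry mapping a data type to its parsing function.
PARSERS: Dict[str, Callable[[Any], dict]] = {}


def register_parser(data_type: str):
    """Decorator that registers a parser for one data type."""
    def decorator(func: Callable[[Any], dict]) -> Callable[[Any], dict]:
        PARSERS[data_type] = func
        return func
    return decorator


@register_parser("text")
def parse_text(payload: Any) -> dict:
    return {"type": "text", "content": str(payload)}


@register_parser("json")
def parse_json(payload: Any) -> dict:
    return {"type": "json", "content": json.dumps(payload, sort_keys=True)}


@register_parser("audio")  # a new type beyond text, image, table, and json
def parse_audio(payload: Any) -> dict:
    return {"type": "audio", "content": payload["url"]}


def parse_payload(data_type: str, payload: Any) -> dict:
    """Dispatch to the registered parser, failing loudly on unknown types."""
    if data_type not in PARSERS:
        raise ValueError(f"no parser registered for data type {data_type!r}")
    return PARSERS[data_type](payload)
```

The point of the sketch is that an unknown type fails explicitly until a parser and a matching data model are added for it.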

Extend A New LLM

Extending a new LLM as the agent backbone is simpler if the LLM is already hosted and can be called via an API: register your new model in backend/api/language_model.py, using lemur-chat as a template.

If the LLM is not hosted yet, we have a tutorial on how to deploy a new LLM and expose it as an API [here]() (LLM hosting: TODO).
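
For the already-hosted case, registration can be thought of as adding one entry that tells the backend where the model's API lives. The Python sketch below shows that idea only; `HostedModel`, `MODEL_REGISTRY`, `register_model`, the `max_tokens` field, and the example name and URL are all assumptions, not the contents of backend/api/language_model.py.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class HostedModel:
    """Hypothetical description of an LLM reachable over an API."""
    name: str
    api_base: str
    max_tokens: int = 2048  # assumed default, for illustration only


# Hypothetical registry keyed by model name.
MODEL_REGISTRY: Dict[str, HostedModel] = {}


def register_model(model: HostedModel) -> None:
    """Add a model, refusing to silently overwrite an existing entry."""
    if model.name in MODEL_REGISTRY:
        raise ValueError(f"model {model.name!r} is already registered")
    MODEL_REGISTRY[model.name] = model


# Example entry; the name and URL are placeholders.
register_model(HostedModel(name="my-new-llm", api_base="http://localhost:8001/v1"))
```

Refusing duplicate names is a design choice in this sketch, made so that a new model cannot accidentally shadow an existing one.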

Extend A New Tool

If you want to extend a new tool in Plugins Agent, you can follow the steps below:

👏 Contributing

Thanks to the efforts of open-source communities such as LangChain, ChatBot UI, the Taxy.ai browser extension, and others, we were able to build our interface prototype much more conveniently and efficiently.

We welcome contributions and suggestions; together we can make OpenAgents better! Contributions that follow these steps will be well received:

Before you start, we highly recommend taking a moment to check here.

📖 Documentation

Please check here for full documentation, which will be updated to stay on pace with the demo changes and the code release.

🧙‍Participants

Tech Lead

Co-Lead Contributors

Key Contributors

Valuable Contributors

Acknowledgments (beyond code)

Heartfelt appreciation to Ziyi Huang, Roxy Rong, Haotian Li, Xingbo Wang, Jansen Wong, and Chen Henry Wu for their valuable contributions to OpenAgents. Their expertise and insights were instrumental in bringing this project to fruition!

Open Source Contributors

Thanks to all the contributors!

Citation

If you find our work helpful, please cite us:

@misc{OpenAgents,
      title={OpenAgents: An Open Platform for Language Agents in the Wild}, 
      author={Tianbao Xie and Fan Zhou and Zhoujun Cheng and Peng Shi and Luoxuan Weng and Yitao Liu and Toh Jing Hua and Junning Zhao and Qian Liu and Che Liu and Leo Z. Liu and Yiheng Xu and Hongjin Su and Dongchan Shin and Caiming Xiong and Tao Yu},
      year={2023},
      eprint={2310.10634},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Acknowledgments

We would like to thank Google Research, Amazon AWS, and Salesforce Research for their research gift funds to this open-source effort!

⭐️ Star History

A ⭐️ to OpenAgents is to make it shine brighter and benefit more people.