Farama-Foundation / chatarena

ChatArena (or Chat Arena) is a set of multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.
Apache License 2.0
1.33k stars 129 forks

Future Direction and Requests #24

Open clubroomjp opened 1 year ago

clubroomjp commented 1 year ago

Nice to meet you. ChatArena is an exciting project, and I have great expectations for it. There are several things I would like to achieve with ChatArena.

  1. Hold a tournament in which AIs, each prompted by a user, compete against one another in logical reasoning and debate.
  2. Have the AI act as the GM of a tabletop RPG while multiple human players enjoy the scenario.
  3. Create an AI version of "The Sims" with transplanted character behaviors.

For 1 and 2, multiple humans can participate remotely and a public viewing mechanism is required. For 3, we would like to recreate life on stage by storing data on the tone of voice, settings, and worldview of multiple characters. (e.g. https://arxiv.org/abs/2304.03442)

All of these would require extending ChatArena. Is the project intended to evolve in this direction?

yuxiang-wu commented 1 year ago

These are very interesting directions to explore, and we definitely see the possibility that ChatArena could evolve in these directions. We completely agree that making the environments more accessible to ordinary users and more enjoyable for human players has huge value. But due to our limited capacity, our team currently focuses on enriching the set of environments and LLM backends, so that more LLMs can interact with each other in diverse environments. We welcome community contributions that make ChatArena more fun to play with and that provide more intuitive interfaces.

We are going to release a detailed development plan this week. Please join our Slack channel to get the latest updates on ChatArena.

FranxYao commented 1 year ago

See GPT-Bargaining for a tournament!

https://github.com/FranxYao/GPT-Bargaining

Basically, we ask GPT, Claude, Cohere, and Jurassic models to bargain with each other and see who can get a better deal.

Our ranking is: GPT-4 > Claude-v1.3 > GPT-3.5-Turbo > Claude-instant-v1.0 > Jurassic > Cohere
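
As a rough illustration of how a ranking like this could be derived from pairwise games, here is a minimal round-robin sketch. It is not the GPT-Bargaining code: the seller/buyer framing, the `play_bargain` stub, the `TOY_SKILL` values, and the surplus-based scoring are all assumptions made for illustration, and the model names are placeholders rather than real results.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical "skill" values standing in for real model calls; not measured data.
TOY_SKILL = {"model_a": 0.9, "model_b": 0.6, "model_c": 0.3}


def play_bargain(seller: str, buyer: str, low: float = 10.0, high: float = 20.0) -> float:
    """Toy stand-in for one bargaining game; returns the agreed price.

    A real version would run an LLM dialogue. Here the price simply moves
    toward `high` when the seller is "stronger" than the buyer.
    """
    share = 0.5 + 0.5 * (TOY_SKILL[seller] - TOY_SKILL[buyer])
    return low + share * (high - low)


def rank_models(models, low: float = 10.0, high: float = 20.0):
    """Round-robin: every ordered (seller, buyer) pair plays exactly once."""
    surplus = defaultdict(float)
    for seller, buyer in permutations(models, 2):
        price = play_bargain(seller, buyer, low, high)
        surplus[seller] += price - low   # seller gains what is above the floor
        surplus[buyer] += high - price   # buyer gains what is below the ceiling
    # Highest total surplus first.
    return sorted(models, key=lambda m: surplus[m], reverse=True)


ranking = rank_models(list(TOY_SKILL))
print(" > ".join(ranking))
```

Playing each pair in both roles keeps a model from ranking well only because it happened to get the easier side of the deal.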

We have just finished the paper and will very soon integrate the bargaining game into ChatArena!

clubroomjp commented 1 year ago

Thank you all for your replies. I hope this project continues to develop. This is an interesting experiment, and it is great that Claude v1.3 is performing well. I look forward to seeing it demonstrated in ChatArena.

yuxiang-wu commented 1 year ago

We've announced our upcoming features and future directions in #36. Feel free to comment and request new features there if the feature you want is not covered.