This is an experimental project. We are attempting to design a general assistant system that evolves from System1 to System2. The name Sibyl comes from the multi-agent system composed of numerous human brains in Psycho-Pass.
If you find our work useful, please cite our paper:
@article{wang2024sibyl,
title={Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning},
author={Yulong Wang and Tianhao Shen and Lifeng Liu and Jian Xie},
year={2024},
eprint={2407.10718},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2407.10718},
}
Model Name | Average score (%) | Level 1 score (%) | Level 2 score (%) | Level 3 score (%) |
---|---|---|---|---|
Sibyl System v0.2 | 34.55 | 47.31 | 32.7 | 16.33 |
Multi-Agent Experiment v0.1 (powered by AutoGen) | 32.33 | 47.31 | 28.93 | 14.58 |
FRIDAY | 24.25 | 40.86 | 20.13 | 6.12 |
GPT4 + manually selected plugins | 14.6 | 30.3 | 9.7 | 0 |
GPT4 Turbo | 6.67 | 9.68 | 6.92 | 0 |
AutoGPT4 | 5 | 15.05 | 0.63 | 0 |
Currently popular assistant systems like ChatGPT are designed to solve human decision-making problems at the minute level. Even with methods such as CoT and ReAct, they encounter significant difficulties in handling problems at the 10-minute level. Our system aims to gradually solve problems from the minute level to the hour level and even the day level.
Decoder-only models have a beneficial characteristic of being pure functions, which allows us to better control complexity. However, as we evolve from System1 to System2, the introduction of states inevitably causes the system's complexity to gradually spiral out of control. Some existing Multi-Agent solutions introduce too many states, making the system difficult to scale sustainably. We aim to control this Multi-Agent characteristic within parts of the system or push it to the system's edges.
If you have any inquiries, please feel free to raise an issue or reach out to us via email at: wangyulong@gmail.com, thshen@tju.edu.cn