[Feature Request]: Multi-agents + Function call + Agent pipeline customization

Reekin commented 6 months ago

Problem Description

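Conversations are currently limited to "question and answer" or "consultation". In task-oriented situations, I have to prompt the AI repeatedly to make any progress. For example, when I want to find out which script contains this project's input box, and where in that script, I first have to send it a screenshot of the project directory structure, then tell it that it can direct me to open specific directories and files for a closer look. Only after several rounds of this do we find it. Yet this is a simple task that the AI should be able to handle on its own.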

Solution Description

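Would it be possible to incorporate the idea behind AutoAgent, so that within one conversation different agents keep talking to each other and complete the user's task through function calls supplied by the user? Going further, each agent's reply pipeline could be customized through a JSON configuration expressed as script code or a node graph. (A single agent may involve several question-and-answer requests internally, for example: first work out what the user's goal is, then evaluate whether the previous reply has already fully solved the user's problem, and if not, analyze how to use the available tools to get closer to the goal...)

That way, a conversation is no longer a chore of repeatedly working out how to prompt the AI step by step; the agent digests as much of the task as it can on its own.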

Alternatives Considered

No response

Additional Context

No response

Issues-translate-bot commented 6 months ago

Bot detected the issue body's language is not English, translate it automatically.

Problem Description

Conversations are currently limited to "question and answer" or "consultation". In task-oriented situations, the AI has to be prompted repeatedly to make any progress. For example, when I want to find out which script contains this project's input box, and where in that script, I first have to send it a screenshot of the project directory structure, then tell it that it can direct me to open specific directories and files for a closer look. Only after several rounds of this do we find it. Yet this is a simple task that the AI should be able to handle on its own.
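To make the example concrete, here is a minimal sketch of the kind of user-supplied tool that would let an agent locate the input box by itself. The function name `find_in_project`, its parameters, and the `FIND_IN_PROJECT_SPEC` description are all hypothetical and not part of ChatGPT-Next-Web; the spec only imitates the JSON-Schema style that function-calling APIs commonly accept.

```python
import os

# Hypothetical tool description in a JSON-Schema style (an assumption,
# not an API that this project defines).
FIND_IN_PROJECT_SPEC = {
    "name": "find_in_project",
    "description": "Search source files under a directory for a text snippet.",
    "parameters": {
        "type": "object",
        "properties": {
            "root": {"type": "string"},
            "needle": {"type": "string"},
        },
        "required": ["root", "needle"],
    },
}


def find_in_project(root, needle, extensions=(".ts", ".tsx", ".js", ".py")):
    """Return sorted (relative_path, line_number, stripped_line) tuples for each match."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only look at source files; skip everything else.
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if needle in line:
                            rel = os.path.relpath(path, root)
                            matches.append((rel, lineno, line.strip()))
            except OSError:
                # Unreadable files are skipped rather than aborting the search.
                continue
    return sorted(matches)
```

With a tool like this registered, the agent could call `find_in_project(project_root, "textarea")` itself and report the file and line, instead of asking the user for screenshots.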

Solution Description

Would it be possible to incorporate the idea behind AutoAgent, so that within one conversation different agents keep talking to each other and complete the user's task through function calls supplied by the user? Going further, each agent's reply pipeline could be customized through a JSON configuration expressed as script code or a node graph. (A single agent may involve several question-and-answer requests internally, for example: first work out what the user's goal is, then evaluate whether the previous reply has already fully solved the user's problem, and if not, analyze how to use the available tools to get closer to the goal...)

That way, a conversation is no longer a chore of repeatedly working out how to prompt the AI step by step; the agent digests as much of the task as it can on its own.

Can you refer to agent frameworks such as agency-swarm?
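The pipeline described above (infer the goal, evaluate the previous reply, plan tool use) could be sketched roughly as follows. This is only an illustration under stated assumptions: the JSON keys (`agent`, `max_rounds`, `steps`), the step names, and the `run_pipeline` and `fake_model` functions are all hypothetical, and `fake_model` is a stub standing in for real LLM requests.

```python
import json

# Hypothetical per-agent pipeline configuration (all key names are assumptions).
PIPELINE_JSON = """
{
  "agent": "task_solver",
  "max_rounds": 3,
  "steps": ["infer_goal", "evaluate_reply", "plan_tool_use"]
}
"""


def run_pipeline(config, user_message, ask_model, tools):
    """Repeat the configured steps until the reply is judged solved or rounds run out."""
    state = {"user_message": user_message, "goal": None, "reply": None, "rounds": 0}
    for _ in range(config["max_rounds"]):
        state["rounds"] += 1
        for step in config["steps"]:
            if step == "infer_goal":
                # The goal is inferred once and reused in later rounds.
                if state["goal"] is None:
                    state["goal"] = ask_model("infer_goal", state)
            elif step == "evaluate_reply":
                # Stop as soon as an existing reply is judged to solve the goal.
                if state["reply"] is not None and ask_model("evaluate_reply", state) == "solved":
                    return state
            elif step == "plan_tool_use":
                tool_name, argument = ask_model("plan_tool_use", state)
                state["reply"] = tools[tool_name](argument)
            else:
                raise ValueError(f"unknown step: {step}")
    return state


def fake_model(step, state):
    """Stub replacing real LLM calls so the sketch runs offline."""
    if step == "infer_goal":
        return "locate the input box"
    if step == "evaluate_reply":
        return "solved" if "chat.tsx" in state["reply"] else "unsolved"
    if step == "plan_tool_use":
        return ("search", "input")
    raise ValueError(step)


tools = {"search": lambda query: f"'{query}' found in app/chat.tsx"}
result = run_pipeline(json.loads(PIPELINE_JSON), "Where is the input box?", fake_model, tools)
```

Keeping the step order in data rather than in code is what would let users customize the pipeline per agent; a node-graph editor could emit the same structure.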

Alternatives Considered

No response

Additional Context

No response

daiaji commented 6 months ago

Can current LLMs really do this?

Issues-translate-bot commented 6 months ago

Bot detected the issue body's language is not English, translate it automatically.

Can current LLMs really do this?

Reekin commented 6 months ago

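Can current LLMs really do this?

I don't expect it to handle something on the level of "writing an entire project by itself". But for tasks that are fairly simple or follow a fairly fixed process, a customizable pipeline could reduce the interaction to a single sentence of instruction at the start.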

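Examples of simple task scenarios:

Tasks that can be done with a simple Python/bat script, such as "adding an option to the right-click menu" or "batch renaming files"
Creating specific content according to a fixed process, where the pipeline may involve several different agents for thinking, reviewing, insight, polishing, and so on
Finding where a given feature is implemented in a code project
......

It can be understood as a greatly enhanced version of Masks: a Mask specializes replies mainly through the prompt, while an agent adds more customization and a wider range of operations (function calls).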

Issues-translate-bot commented 6 months ago

Bot detected the issue body's language is not English, translate it automatically.

Can current LLMs really do this?

I don't expect it to handle something on the level of "writing an entire project by itself". But for tasks that are fairly simple or follow a fairly fixed process, a customizable pipeline could reduce the interaction to a single sentence of instruction at the start.

Examples of simple task scenarios:

Tasks that can be done with a simple Python/bat script, such as "adding an option to the right-click menu" or "batch renaming files"
Creating specific content according to a fixed process, where the pipeline may involve several different agents for thinking, reviewing, insight, polishing, and so on
Finding where a given feature is implemented in a code project
......

It can be understood as a greatly enhanced version of Masks: a Mask specializes replies mainly through the prompt, while an agent adds more customization and a wider range of operations (function calls).
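The content-creation scenario above, with several agents acting in turn, might look something like the sketch below. Everything here is hypothetical: the `Agent` dataclass, `run_chain`, and the three stub `respond` functions are stand-ins, and a real setup would replace each stub with an LLM request that uses the agent's instructions as its prompt.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Agent:
    name: str
    instructions: str                      # plays the role a Mask prompt plays today
    respond: Callable[[str, str], str]     # (instructions, text) -> new text


def run_chain(agents: List[Agent], task: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Pass the text through each agent in order and record every hand-off."""
    transcript = []
    text = task
    for agent in agents:
        text = agent.respond(agent.instructions, text)
        transcript.append((agent.name, text))
    return text, transcript


# Stub behaviours so the sketch runs without any model access.
agents = [
    Agent("writer", "Draft the content.", lambda _i, t: f"draft for: {t}"),
    Agent("reviewer", "Check the draft.", lambda _i, t: t + " [reviewed]"),
    Agent("polisher", "Polish the wording.", lambda _i, t: t[0].upper() + t[1:]),
]

final_text, transcript = run_chain(agents, "release notes")
```

The difference from a Mask is visible in the data structure: each agent carries its own behaviour and position in the chain, not just a prompt.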

ChatGPTNextWeb / ChatGPT-Next-Web