fiatrete / OpenDAN-Personal-AI-OS

OpenDAN is an open-source Personal AI OS that consolidates various AI modules in one place for your personal use.
https://opendan.ai
MIT License
1.59k stars, 129 forks

How to split a complex workflow #40

Open streetycat opened 11 months ago

streetycat commented 11 months ago

I read your workflow pseudocode and found that, after receiving a task (AgentMsg), the workflow can process it in one of two ways:

  1. Find a corresponding role to handle the task
  2. All roles process the task separately and combine the results

In reality, many complex tasks are completed through division of labor and cooperation: a leader splits the task into smaller subtasks, which are then carried out synchronously. The workflow does not directly support this working mode.

After thinking about it, there are roughly two scenarios where tasks need to be split:

  1. The task requires roles with different skills

Based on the current design, these roles can be constructed in the workflow, and the task can be assigned to all of them to handle separately. Each role understands and decomposes the task, and deals only with the part that falls within its own responsibilities. Finally, the results from all roles are combined into the final result.
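A minimal sketch of this first scenario, under stated assumptions: `Role`, `process`, and `broadcast_and_merge` are hypothetical names used only for illustration and are not the OpenDAN API. Each stub role stands in for an LLM call and handles only its own part of the task; the merge step here is plain concatenation.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Role:
    """Hypothetical stand-in for a workflow role; not the OpenDAN class."""
    name: str
    skill: str

    async def process(self, task: str) -> str:
        # A real role would call an LLM here; this stub only labels the task
        # with the part this role is responsible for.
        await asyncio.sleep(0)
        return f"[{self.name}] handled the {self.skill} part of: {task}"


async def broadcast_and_merge(task: str, roles: list[Role]) -> str:
    """Send the same task to every role concurrently, then combine the results."""
    results = await asyncio.gather(*(role.process(task) for role in roles))
    # Simplest possible merge: join the partial results in role order.
    return "\n".join(results)


if __name__ == "__main__":
    roles = [Role("designer", "UI"), Role("coder", "backend")]
    print(asyncio.run(broadcast_and_merge("build a todo app", roles)))
```

`asyncio.gather` returns results in the order the roles were passed in, so the merged output is stable even though the roles run concurrently.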

  2. A single role has limited capabilities (for an LLM, mainly context length?), so the task requires many roles with the same skills

Based on the current design, a two-level workflow can be built. The first level constructs a role and assigns the task to it. That role understands the task, splits it into a task list, and delivers the task list to the sub-workflow. The sub-workflow receives the task list, processes only the first subtask that has not been completed, and updates the status of that subtask in the task list. A new sub-workflow is then started with the updated list, and this repeats until all subtasks in the task list are completed.

flow chart:

graph TB
    subgraph "Workflow(leader)"
        ReceiveMsg["msg = pop_msg()"]-->SplitTask["task_list = leader_role.split_task(msg)"]-->SendToExecutor["result = send(executor_workflow)"]
        WaitResult["result = wait_result()"]
    end

    SendToExecutor-.->ReceiveTaskList

    subgraph "Workflow(executor)"
        ReceiveTaskList["task_list = pop_msg()"]-->ExecuteNextTask["result = executor_role.execute(task_list)"]-->IsFinished{"is_finished(result)"}--no-->ReceiveTaskList
        IsFinished--yes-->PostResult["post_result(result)"]
    end

    PostResult-.->WaitResult
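The loop in the flow chart above can be sketched roughly as follows. This is only an illustration under assumptions: `SubTask`, `TaskList`, `split_task`, `execute_one`, and `run_leader_workflow` are hypothetical names, the split is a plain string split instead of an LLM call, and message passing between the two workflows is reduced to function calls.

```python
from dataclasses import dataclass, field


@dataclass
class SubTask:
    description: str
    done: bool = False
    result: str = ""


@dataclass
class TaskList:
    tasks: list[SubTask] = field(default_factory=list)

    def next_pending(self):
        """Return the first subtask that has not been completed, or None."""
        return next((t for t in self.tasks if not t.done), None)


def split_task(msg: str) -> TaskList:
    # Stand-in for leader_role.split_task(msg): split on ';' instead of
    # asking an LLM to decompose the task.
    return TaskList([SubTask(p.strip()) for p in msg.split(";") if p.strip()])


def execute_one(task_list: TaskList) -> TaskList:
    # Stand-in for executor_role.execute(task_list): handle only the first
    # pending subtask and record its status, as one executor pass would.
    task = task_list.next_pending()
    if task is not None:
        task.result = f"done: {task.description}"
        task.done = True
    return task_list


def run_leader_workflow(msg: str) -> list[str]:
    task_list = split_task(msg)
    # Each iteration corresponds to starting the executor sub-workflow again
    # with the updated task list.
    while task_list.next_pending() is not None:
        task_list = execute_one(task_list)
    return [t.result for t in task_list.tasks]


if __name__ == "__main__":
    print(run_leader_workflow("collect data; clean data; write report"))
```

Because each pass handles a single subtask, the executor never needs more context than the task list plus one subtask, which is the point of splitting when context length is the limit.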

The second scenario has a somewhat recursive structure, which makes it harder to understand. If the workflow could directly provide a general split mode, it would be friendlier to developers. Achieving this may require introducing a special role (leader). The general process is as follows:

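if self.leader is not None:
    # Let the leader role analyze the message, then split its reply into subtasks.
    result = await _process_msg(msg, self.leader)
    task_list = split_tasks(result)
    task_result_list = []

    for task in task_list:
        # Give each subtask the leader's analysis as context.
        task_msg = result + ". please execute this task:" + task
        role = self.input_filter.select(task_msg)
        task_result = await _process_msg(task_msg, role)
        # append() keeps each result as one item; += would splice a string in character by character.
        task_result_list.append(task_result)

    result = self._merge_msg_result(task_result_list)

    chatsession.append_post(result)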

In addition, is merging the results itself an intelligent summarization process, and does it therefore also need a role to handle it?
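One way to picture the merge step being handled by a role is sketched below. Everything here is an assumption made for illustration: a role is modeled as an async function from prompt to reply, and `leader`, `worker`, `summarizer`, and `process_with_leader` are hypothetical stubs, not OpenDAN functions. A real summarizer would ask an LLM to condense the subtask results.

```python
import asyncio
from typing import Awaitable, Callable

# Assumption: a role is just an async function from prompt text to reply text.
RoleFn = Callable[[str], Awaitable[str]]


async def leader(prompt: str) -> str:
    # Stub leader: returns one subtask per line instead of calling an LLM.
    return "\n".join(f"step {i}" for i in range(1, 4))


async def worker(prompt: str) -> str:
    # Stub worker: treats the last line of its prompt as the subtask.
    return "result of " + prompt.splitlines()[-1]


async def summarizer(prompt: str) -> str:
    # Stub merge role: a real one would summarize the results with an LLM.
    return f"summary of {len(prompt.splitlines())} results"


async def process_with_leader(
    msg: str,
    leader_role: RoleFn,
    select_role: Callable[[str], RoleFn],
    merge_role: RoleFn,
) -> str:
    plan = await leader_role(msg)
    tasks = [line for line in plan.splitlines() if line.strip()]
    results = []
    for task in tasks:
        # The leader's plan is passed along as context for each subtask.
        task_msg = f"{plan}\nPlease execute this task:\n{task}"
        role = select_role(task_msg)
        results.append(await role(task_msg))
    # Merging is delegated to a role rather than done by string concatenation.
    return await merge_role("\n".join(results))


if __name__ == "__main__":
    print(asyncio.run(
        process_with_leader("write a report", leader, lambda m: worker, summarizer)
    ))
```

Treating the merger as just another role keeps the workflow code uniform: swapping simple concatenation for an LLM-based summary changes only which function is passed as `merge_role`.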

waterflier commented 10 months ago

I believe we're on the same page. The essence of building a team (workflow) of AI Agents to decompose complex problems is to work around the persistent constraint of the LLM token limit. You can see that the current design (src/aios_kernel/workflow.py) already includes the process of roles breaking tasks down into sub-workflows and merging the results.

I consider this a fundamental requirement, so there's no need to define a role called "Leader" at the base level. However, I also predict that within workflows built on OpenDAN, there will undoubtedly be a frequently used role named "Leader" equipped with a set of sophisticated prompts.

As always, I appreciate your thoughts and feedback on this matter. Let's continue to collaborate to make our project a success.