fiatrete / OpenDAN-Personal-AI-OS

OpenDAN is an open source Personal AI OS , which consolidates various AI modules in one place for your personal use.
https://opendan.ai

Some issues regarding workflow implementation #39

Closed wugren closed 2 months ago

wugren commented 11 months ago

1. Code:

result = await ComputeKernel().do_llm_completion(prompt, the_role.agent.get_llm_model_name(), the_role.agent.get_max_token_size())

Question: ComputeKernel should be a complex scheduling system that contains various resources and their related interfaces, but here it is used only for LLM calls. It would be more appropriate to encapsulate an LLM interface on top of ComputeKernel.
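
For illustration, a minimal sketch of the kind of wrapper being suggested. LLMClient is a hypothetical name, not part of the OpenDAN code base; ComputeKernel, do_llm_completion, get_llm_model_name and get_max_token_size are taken from the snippet above.

```python
# Hypothetical sketch only: LLMClient does not exist in the project.
# The import of ComputeKernel is omitted because it depends on the project layout.

class LLMClient:
    """Thin LLM facade so workflow code never touches ComputeKernel directly."""

    def __init__(self, agent):
        self.agent = agent

    async def completion(self, prompt: str) -> str:
        # ComputeKernel stays free to schedule the request over whatever compute
        # resources it manages; callers only see a plain LLM completion interface.
        return await ComputeKernel().do_llm_completion(
            prompt,
            self.agent.get_llm_model_name(),
            self.agent.get_max_token_size())

# Usage inside a workflow step:
#   result = await LLMClient(the_role.agent).completion(prompt)
```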

2. Code:

callchain:CallChain = self._parse_function_call_chain(result)

Question: In my understanding, shouldn't this just be a single, definite function call? Why is it a CallChain, and what is the structure of this CallChain?

3. Code:

next_msg:AgentMsg = self._parse_to_msg(result)
if next_msg is not None:
  # TODO: Next Target can be another role in workflow
  next_workflow:Workflow = self.get_workflow(next_msg.get_target())

Question: Do agents in the same workflow also communicate through send_message? It would be clearer to call the target agent's interface directly, and the implementation would be simpler.

4. Code:

inner_chat_session = the_role.agent.get_chat_session(next_msg.get_target(),next_msg.get_session_id())
inner_chat_session.append_post(next_msg)
resp = await next_workflow.send_msg(next_msg)
inner_chat_session.append_recv(resp)

Question: Shouldn't the session state be maintained by the owning Agent or Workflow? Why is it fetched and updated by the caller?

5. Should there be an explicit interface for creating a session? Only the caller knows whether a new session is needed.

waterflier commented 11 months ago
  1. LLM calls move few bytes of I/O but need a lot of computation, which makes them highly suitable for distributed execution; the benefit is immediately noticeable once support is added. For an MVP, I think this is an excellent example that demonstrates the entire design of a distributed computing kernel, and the complexity seems to be just right.

  2. This is still a question of granularity in the trade-off between flexibility and ease of use. In some scenarios users will need to combine multiple function calls; ideally they could solve all of their needs by extending a single function. With a call-chain design, however, the system can support most combination patterns (such as concurrent or sequential invocation of a set of functions), which makes about 80% of the extensions much simpler.
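
To make that concrete, here is a tiny, self-contained reading of the call-chain idea; none of these class or function names come from the OpenDAN code base. Each step in the chain is a group of calls executed concurrently, and the steps themselves run in order.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, List

# Hypothetical sketch of a call chain: a list of steps, where each step is a
# group of function calls. Calls within a step run concurrently; steps run
# sequentially. Results are not piped between steps in this simplification.

@dataclass
class FunctionCall:
    func: Callable[..., Awaitable[Any]]
    args: tuple = ()

@dataclass
class CallChain:
    steps: List[List[FunctionCall]] = field(default_factory=list)

    async def run(self) -> List[List[Any]]:
        results = []
        for step in self.steps:
            # Await all calls of one step together, then move to the next step.
            results.append(await asyncio.gather(*(c.func(*c.args) for c in step)))
        return results

async def fetch(name: str) -> str:
    return f"data:{name}"

async def summarize(a: str, b: str) -> str:
    return f"summary({a}, {b})"

async def main():
    chain = CallChain(steps=[
        [FunctionCall(fetch, ("inbox",)), FunctionCall(fetch, ("calendar",))],  # concurrent
        [FunctionCall(summarize, ("inbox", "calendar"))],                       # runs after step 1
    ])
    print(await chain.run())

asyncio.run(main())
```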

  3. The code doesn't yet include an implementation for sending messages to roles within the workflow.

  4. This example shows that a workflow manages the messages it sends and receives according to its own rules; how a received message and the reply to it are stored is decided by the other workflow's own logic. A simple corollary is that a single message will usually end up in the chat sessions of two different workflows.
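
A toy illustration of that point, with hypothetical names (only append_post and append_recv echo the snippet quoted above): the same message is recorded once on each side, in each workflow's own session.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical sketch: each workflow keeps its own chat session records.
@dataclass
class ChatSession:
    owner: str
    records: List[Tuple[str, str]] = field(default_factory=list)

    def append_post(self, msg): self.records.append(("post", msg))
    def append_recv(self, msg): self.records.append(("recv", msg))

sender_session = ChatSession(owner="workflow_a")    # kept by the sending workflow
receiver_session = ChatSession(owner="workflow_b")  # kept by the receiving workflow

msg = "please summarize today's mail"
sender_session.append_post(msg)    # the sender records what it sent, by its rules
receiver_session.append_recv(msg)  # the receiver records what it got, by its own rules
```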

  5. The pseudocode doesn't show it, but a unique session is essentially identified by the triple (session_owner_id, remote_object_id, topic_id).
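
A minimal sketch of that identification rule, which would also answer question 5 above: if the triple uniquely identifies a session, a get-or-create lookup removes the need for an explicit creation interface. SessionManager and get_or_create_session are hypothetical names, not the project's API; only the three identifiers come from the comment.

```python
from typing import Dict, List, Tuple

class ChatSession:
    def __init__(self, key: Tuple[str, str, str]):
        self.key = key
        self.messages: List[str] = []

class SessionManager:
    def __init__(self):
        self._sessions: Dict[Tuple[str, str, str], ChatSession] = {}

    def get_or_create_session(self, session_owner_id: str,
                              remote_object_id: str,
                              topic_id: str) -> ChatSession:
        # Because the triple uniquely identifies a session, callers never need an
        # explicit "create" step: the first lookup for a new triple creates it.
        key = (session_owner_id, remote_object_id, topic_id)
        if key not in self._sessions:
            self._sessions[key] = ChatSession(key)
        return self._sessions[key]

# Usage:
#   session = SessionManager().get_or_create_session("agent_a", "workflow_b", "daily_report")
```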