ghost-in-moss / GhostOS

An agent framework offering a Python code interface for LLM-driven agents and meta-agents to do everything by code generation.
MIT License

Could you introduce the difference between GhostOS and AIOS? #37

Closed jojoyu closed 1 month ago

jojoyu commented 1 month ago

AIOS info: https://github.com/agiresearch/AIOS https://arxiv.org/abs/2312.03815

thirdgerb commented 1 month ago

Thank you for your question. I just finished a vacation, so I'm sorry for not replying in time. It seems that AIOS focuses on the OS but provides an agent as its interaction interface, and the main user of AIOS is a human. GhostOS, on the other hand, focuses on the Ghost (the Agent) and provides a Python code interface for the Agent to operate its OS. In this OS, various capabilities/tools (such as a terminal, file management, etc.) can be provided to the Agent through Python libraries. Ideally, the Meta-Agent can also encapsulate the libraries it wants.
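
To make this concrete, here is a minimal sketch of what a capability "provided through a Python library" could look like. The `Terminal` and `FileManager` interfaces below are hypothetical illustrations, not the actual GhostOS API:

```python
from abc import ABC, abstractmethod

class Terminal(ABC):
    """Hypothetical capability: the Agent sees only this interface's source
    code in its prompt and calls it from the code it generates; the concrete
    implementation stays on the OS side."""

    @abstractmethod
    def exec(self, command: str) -> str:
        """Run a shell command and return its output."""
        ...

class FileManager(ABC):
    """Hypothetical file-management capability."""

    @abstractmethod
    def read(self, path: str) -> str:
        """Return the text content of a file."""
        ...

    @abstractmethod
    def write(self, path: str, content: str) -> None:
        """Overwrite a file with the given content."""
        ...
```

The point is that the interface's Python source code, rather than a JSON tool schema, is what the agent reads and programs against.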

NileZhou commented 1 month ago

First, sincerely, thanks for your question.

In my view, the following is one of the core capabilities of GhostOS (just a mini subset).

Suppose you want an LLM to handle a complex task (like debugging a project):

1. what you define (a single AIFunc to solve this problem):

```python
# import a lot of things, e.g.:
from typing import Optional, Set
from pydantic import Field
# (plus the GhostOS types: AIFunc, AIFuncResult, CulpritFilePart, ...)

class DebugAgentFn(AIFunc):
    request: str = Field(default="", description="raw request for the agent")

class DebugAgentFnResult(AIFuncResult):
    issue_culprit_file_parts: Set[CulpritFilePart] = Field(default_factory=set, description="the file parts that caused the issue")
    err: Optional[str] = Field(default=None, description="error message")

...
```


There are no bodies or methods in xxxAIFunc or xxxAIFuncResult!
You don't need to implement an AIFunc, just define it!

2. what the LLM sees (a Python code interface, generated by GhostOS/MOSS; one way such an interface can be rendered is sketched after this list):

```text
...
You are an AIFunc named DebugAgentFn. AIFunc is an LLM-driven function that can complete a request through multi-turn thinking, and your purpose is to generate a DebugAgentFnResult (an AIFuncResult) as the final result.

# code that lets the LLM know the context (generated by GhostOS/MOSS)
...
```
3. what the LLM generates:

```python
def main(moss: Moss, fn: DebugAgentFn):
    # Define useful constants
    ...

    # Define the directory to start exploration
    ...

    # Initialize the FileContentOperations
    ...

    # Define the search string from the issue description
    ...

    # Explore each file containing the search string
    for file_path in files_containing_string:
        if cur_step >= max_steps:
            break

        # Read the content of the file
        file_content = file_operations.read_file(file_path)

        # Create an instance of FileExplorerAIFunc to analyze the file
        file_explorer = FileExplorerAIFunc(
            cur_object=file_path,
            debug_task_ctx=fn.debug_task_ctx,
            file_content=file_content
        )

        # Execute the FileExplorerAIFunc
        file_explorer_result = moss.ai_func_ctx.run(f"explore_{file_path}", file_explorer)

        if file_explorer_result.confidence_percentage >= confidence_percentage_requirement:
            for culprit in file_explorer_result.found_culprits:
                found_culprits.add(culprit)

        cur_step += 1

    # Create the result object (a DebugAgentFnResult, not the AIFunc itself)
    result = DebugAgentFnResult(
        ...
    )

    return result, True
```

Attention: when the LLM invokes FileExplorerAIFunc (a new AIFunc), the whole program enters the next function stack frame, even though FileExplorerAIFunc has not been implemented anywhere before!

4. executed code: the pre-generated code context (generated and managed by MOSS) + the main code above; GhostOS then executes the main code

5. what the LLM sees next (after entering the next frame): the result of FileExplorerAIFunc

6. what the LLM generates next: the next FileExplorerAIFunc call, which may be different

7. the next executed code: ...
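
As a rough illustration of step 2 above (rendering a Python code interface into the prompt), here is a sketch that uses only the standard library; it is not the actual MOSS implementation:

```python
import inspect

def render_code_interface(*objects) -> str:
    """Reflect the source code of classes/functions into a prompt string,
    so the LLM sees the exact Python interfaces it is allowed to call."""
    return "\n\n".join(inspect.getsource(obj) for obj in objects)

# Hypothetical usage: build the code context for the DebugAgentFn frame.
# prompt_context = render_code_interface(DebugAgentFn, DebugAgentFnResult)
```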

As this example shows, when the LLM wants to invoke tools/libraries or make a plan, it uses pure code, so it can generate workflows dynamically, and those workflows are Turing-complete. But sometimes the LLM can't figure out the detailed execution in just ONE turn; in that case it uses some AIFunc that has no body, and when the code interpreter reaches that line, GhostOS lets the LLM implement it first (the implementation may contain further AIFuncs, so it can generate nested workflows).

The difficulty of implementing this example is building a mechanism like the stack frames of a programming language: executing an AIFunc is like pushing a frame onto the stack, and finishing it is like popping the frame off. Even more difficult: how to maintain the correct code context and feed it into the LLM in every frame (every turn of the conversation). We resolved these challenges in MOSS/GhostOS.
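
For intuition, here is a minimal sketch of that frame mechanism, assuming a hypothetical `implement_with_llm` callback that asks the LLM to write the missing body; none of these names are the real MOSS/GhostOS API:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

@dataclass
class Frame:
    func_name: str                  # which AIFunc this frame is running
    code_context: str               # the code context fed to the LLM this turn
    locals: Dict[str, Any] = field(default_factory=dict)

class AIFuncStack:
    """Executing an AIFunc pushes a frame; finishing it pops the frame."""

    def __init__(self) -> None:
        self._frames: List[Frame] = []

    def run(self, func_name: str, code_context: str,
            implement_with_llm: Callable[[Frame], Callable[[Frame], Any]]) -> Any:
        frame = Frame(func_name, code_context)
        self._frames.append(frame)  # push: entering the AIFunc
        try:
            # The AIFunc has no body yet, so ask the LLM for one first;
            # the returned body may call self.run(...) again, nesting frames.
            body = implement_with_llm(frame)
            return body(frame)
        finally:
            self._frames.pop()      # pop: the AIFunc finished
```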

In general, GhostOS is not an OS for humans; it focuses on code-driven agents (such an agent is far stronger than a traditional LLM + code interpreter running a human-crafted workflow). Our vision and intention for this project: an LLM operates (by coding) many things through prompts generated by GhostOS, just like a human operates many things through the GUI generated by Windows/macOS.

The correct category/tags for this project might be: Code Agent + Code Agent's Environment + Code Agent's Evolving Mechanism.