ghost-in-moss / GhostOS

An agent framework offering a Python code interface for LLM-driven agents and meta-agents to do everything by code generation.
MIT License

Could you introduce the difference between GhostOS and AIOS? #37

Closed jojoyu closed 1 month ago

jojoyu commented 1 month ago

AIOS info: https://github.com/agiresearch/AIOS https://arxiv.org/abs/2312.03815

thirdgerb commented 1 month ago

Thank you for your question. I just finished a vacation, so I'm sorry for not replying in time. It seems that AIOS focuses on the OS but provides an agent as its interaction interface, and the main user of AIOS is a human. GhostOS, on the other hand, focuses on the Ghost (the Agent) and provides a Python code interface for the Agent to operate its OS. In this OS, various capabilities/tools (such as a terminal, file management, etc.) can be provided to the Agent through Python libraries. Ideally, the Meta-Agent can also encapsulate the libraries it wants.
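
To make this concrete, here is a minimal sketch of what a capability "provided through a Python library" could look like. The `Terminal` and `FileManager` interfaces below are hypothetical illustrations, not the actual GhostOS API:

```python
from abc import ABC, abstractmethod

class Terminal(ABC):
    """Hypothetical capability: the Agent sees only this interface's source
    code in its prompt and calls it from the code it generates; the concrete
    implementation stays on the OS side."""

    @abstractmethod
    def exec(self, command: str) -> str:
        """Run a shell command and return its output."""
        ...

class FileManager(ABC):
    """Hypothetical file-management capability."""

    @abstractmethod
    def read(self, path: str) -> str:
        """Return the text content of a file."""
        ...

    @abstractmethod
    def write(self, path: str, content: str) -> None:
        """Overwrite a file with the given content."""
        ...
```

The point is that the interface's Python source code, rather than a JSON tool schema, is what the agent reads and programs against.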

NileZhou commented 1 month ago

First, sincerely, thanks for your question.

In my view, the following is one of the core capabilities of GhostOS (just a mini subset).

Suppose you want an LLM to handle a complex task (like debugging a project):

1. what you define (a single AIFunc to solve this problem):

```python
# import a lot of things, e.g.:
from typing import Optional, Set
from pydantic import Field
# (plus the GhostOS types: AIFunc, AIFuncResult, CulpritFilePart, ...)

class DebugAgentFn(AIFunc):
    request: str = Field(default="", description="raw request for the agent")

class DebugAgentFnResult(AIFuncResult):
    issue_culprit_file_parts: Set[CulpritFilePart] = Field(default_factory=set, description="the file parts that caused the issue")
    err: Optional[str] = Field(default=None, description="error message")

...
```


There are no bodies or methods in xxxAIFunc or xxxAIFuncResult!
You don't need to implement an AIFunc, just define it!

2. what the LLM sees (a Python code interface, generated by GhostOS/MOSS; one way such an interface can be rendered is sketched after this list):

```text
...
You are an AIFunc named DebugAgentFn. AIFunc is an LLM-driven function that can complete a request through multi-turn thinking, and your purpose is to generate a DebugAgentFnResult (an AIFuncResult) as the final result.

# code that lets the LLM know the context (generated by GhostOS/MOSS)
...
```
3. what the LLM generates:

```python
def main(moss: Moss, fn: DebugAgentFn):
    # Define useful constants
    ...

    # Define the directory to start exploration
    ...

    # Initialize the FileContentOperations
    ...

    # Define the search string from the issue description
    ...

    # Explore each file containing the search string
    for file_path in files_containing_string:
        if cur_step >= max_steps:
            break

        # Read the content of the file
        file_content = file_operations.read_file(file_path)

        # Create an instance of FileExplorerAIFunc to analyze the file
        file_explorer = FileExplorerAIFunc(
            cur_object=file_path,
            debug_task_ctx=fn.debug_task_ctx,
            file_content=file_content
        )

        # Execute the FileExplorerAIFunc
        file_explorer_result = moss.ai_func_ctx.run(f"explore_{file_path}", file_explorer)

        if file_explorer_result.confidence_percentage >= confidence_percentage_requirement:
            for culprit in file_explorer_result.found_culprits:
                found_culprits.add(culprit)

        cur_step += 1

    # Create the result object (a DebugAgentFnResult, not the AIFunc itself)
    result = DebugAgentFnResult(
        ...
    )

    return result, True
```

Attention: when the LLM invokes FileExplorerAIFunc (a new AIFunc), the whole program enters the next function stack frame, even though FileExplorerAIFunc has not been implemented anywhere before!

4. executed code: the pre-generated code context (generated and managed by MOSS) + the main code above; GhostOS then executes the main code

5. what the LLM sees next (after entering the next frame): the result of FileExplorerAIFunc

6. what the LLM generates next: the next FileExplorerAIFunc call, which may be different

7. the next executed code: ...
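
As a rough illustration of step 2 above (rendering a Python code interface into the prompt), here is a sketch that uses only the standard library; it is not the actual MOSS implementation:

```python
import inspect

def render_code_interface(*objects) -> str:
    """Reflect the source code of classes/functions into a prompt string,
    so the LLM sees the exact Python interfaces it is allowed to call."""
    return "\n\n".join(inspect.getsource(obj) for obj in objects)

# Hypothetical usage: build the code context for the DebugAgentFn frame.
# prompt_context = render_code_interface(DebugAgentFn, DebugAgentFnResult)
```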

As this example shows, when the LLM wants to invoke tools/libraries or make a plan, it uses pure code, so it can generate workflows dynamically, and those workflows are Turing-complete. But sometimes the LLM can't figure out the detailed execution in just ONE turn; in that case it uses some AIFunc that has no body, and when the code interpreter reaches that line, GhostOS lets the LLM implement it first (the implementation may contain further AIFuncs, so it can generate nested workflows).

The difficulty of implementing this example is building a mechanism like the stack frames of a programming language: executing an AIFunc is like pushing a frame onto the stack, and finishing it is like popping the frame off. Even more difficult: how to maintain the correct code context and feed it into the LLM in every frame (every turn of the conversation). We resolved these challenges in MOSS/GhostOS.
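
For intuition, here is a minimal sketch of that frame mechanism, assuming a hypothetical `implement_with_llm` callback that asks the LLM to write the missing body; none of these names are the real MOSS/GhostOS API:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

@dataclass
class Frame:
    func_name: str                  # which AIFunc this frame is running
    code_context: str               # the code context fed to the LLM this turn
    locals: Dict[str, Any] = field(default_factory=dict)

class AIFuncStack:
    """Executing an AIFunc pushes a frame; finishing it pops the frame."""

    def __init__(self) -> None:
        self._frames: List[Frame] = []

    def run(self, func_name: str, code_context: str,
            implement_with_llm: Callable[[Frame], Callable[[Frame], Any]]) -> Any:
        frame = Frame(func_name, code_context)
        self._frames.append(frame)  # push: entering the AIFunc
        try:
            # The AIFunc has no body yet, so ask the LLM for one first;
            # the returned body may call self.run(...) again, nesting frames.
            body = implement_with_llm(frame)
            return body(frame)
        finally:
            self._frames.pop()      # pop: the AIFunc finished
```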

In general, GhostOS is not an OS for humans; it focuses on code-driven agents (such an agent is far stronger than a traditional LLM + code interpreter running a human-crafted workflow). Our vision and intention for this project: an LLM operates (by coding) many things through prompts generated by GhostOS, just like a human operates many things through the GUI generated by Windows/macOS.

The correct category/tags for this project might be: Code Agent + Code Agent's Environment + Code Agent's Evolving Mechanism.