anacostiaAI / anacostia-pipeline

Anacostia is a framework for creating machine learning operations (MLOps) pipelines
Apache License 2.0

figure out how to automatically spin up nodes in a distributed systems architecture #12

Open mdo6180 opened 5 months ago

mdo6180 commented 5 months ago

How Anacostia Servers work:

  1. Server 1 spins up a copy of its subgraph.
  2. Server 1 sends a request to Server 2 to spin up a copy of Server 2's subgraph.
  3. Server 2 spins up a copy of its user-defined subgraph by spinning up every node in that subgraph.
  4. Server 2 sends an ACK to Server 1 when its subgraph is set up.
  5. Server 2's subgraph starts listening for signals from Server 1's subgraph (signals are sent via RPC).
  6. Server 1's subgraph starts executing.
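
The six steps above can be sketched as a small in-process simulation. This is only a sketch: `SketchServer`, `handshake`, and the node names are hypothetical and are not part of the Anacostia API, and plain method calls stand in for the RPC messages between the two servers.

```python
from dataclasses import dataclass, field


@dataclass
class SketchServer:
    """Hypothetical stand-in for an Anacostia server (not the real API)."""
    name: str
    node_names: list                      # the user-defined subgraph
    running: list = field(default_factory=list)
    listening: bool = False
    executing: bool = False

    def spin_up_subgraph(self):
        # Spinning up a subgraph means spinning up every node in it.
        self.running = list(self.node_names)


def handshake(server1, server2):
    """Walk through steps 1-6 in order and return the event log."""
    log = []
    server1.spin_up_subgraph()                           # step 1
    log.append("server1: subgraph up")
    log.append("server1 -> server2: spin-up request")    # step 2 (an RPC in practice)
    server2.spin_up_subgraph()                           # step 3
    log.append("server2: subgraph up")
    log.append("server2 -> server1: ACK")                # step 4
    server2.listening = True                             # step 5
    log.append("server2: listening")
    server1.executing = True                             # step 6
    log.append("server1: executing")
    return log


# Node names below are made up for illustration.
s1 = SketchServer("server1", ["data_store", "trainer"])
s2 = SketchServer("server2", ["evaluator"])
for line in handshake(s1, s2):
    print(line)
```

The point of the ordering is that Server 1 does not start executing until Server 2 has acknowledged that its subgraph is set up.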

Note: For now, assume that a DAG can be split into only two subgraphs, one on Server 1 and one on Server 2. In the future, however, we may allow DAGs like the following:

server1, server2, and server3 all spin up one copy of a subgraph:

server1 -> server2 -> server3

server3 spins up two copies of its subgraph: one to serve server1 and one to serve server2. In other words, each machine runs its subgraph as a service (subgraph as a service, SaaS).

server2 
       \
        server3
       /
server1

server1's subgraph connects to the subgraphs of server2 and server3:

         server2 
        /
server1 
        \
         server3
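
One way to reason about the three topologies above is to write each as a list of (upstream, downstream) edges and count how many subgraph copies each server would spin up. The sketch below assumes one copy per upstream server, with a minimum of one; that rule is inferred from the fan-in example and is not confirmed for other cases. `copies_per_server` and the edge lists are hypothetical names.

```python
def copies_per_server(edges):
    """Count subgraph copies per server: one per upstream server, minimum one.

    Assumption: a server runs a separate copy of its subgraph for each
    upstream server it serves (the "subgraph as a service" idea).
    """
    upstreams = {}
    for upstream, downstream in edges:
        upstreams.setdefault(upstream, set())
        upstreams.setdefault(downstream, set()).add(upstream)
    return {name: max(1, len(ups)) for name, ups in sorted(upstreams.items())}


chain = [("server1", "server2"), ("server2", "server3")]
fan_in = [("server1", "server3"), ("server2", "server3")]
fan_out = [("server1", "server2"), ("server1", "server3")]

for label, edges in [("chain", chain), ("fan-in", fan_in), ("fan-out", fan_out)]:
    print(label, copies_per_server(edges))
```

Under this assumption, only the fan-in topology gives server3 two copies; every server in the chain and fan-out topologies runs a single copy.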

mdo6180 commented 4 months ago

@Veryyes