MatrixAI / Emergence

Distributed Infrastructure Orchestration
Apache License 2.0

Workflow Systems, Dataflow Systems and Build Systems #34

Open CMCDragonkai opened 6 years ago

CMCDragonkai commented 6 years ago

These 3 systems imply a more batch-like workload, rather than just running and maintaining networks of distributed services.

We would like to extend Matrix Emergence to handle these kinds of workloads, which require different tradeoffs.

This issue will track the research and implementation that goes into this:

CMCDragonkai commented 6 years ago

An example of a batch task is an ML training system: something that starts and ends, but doesn't run as a service. Any batch task in Matrix should be convertible to an event-driven service. But doing so requires considering the nature of the "event" (socket-activated or otherwise), and how the batch Automaton starts and ends. One main issue is what starts the Automaton: is it a "recursive" representation of the Emergence orchestrator, that is, itself an orchestrator, or does it rely on the global Emergence system to orchestrate the batch Automaton? To a first approximation, we think that managing the lifecycle of Automatons is the job of Emergence. But this only works if everything has the same global lifetime. Batch tasks introduce the idea that an Automaton's lifetime is actually determined by application-level parameters.
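To make the lifetime distinction concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `Automaton` record, its `batch` flag, and the `supervise` loop are assumptions, not Emergence's actual design. It only shows that a service is restarted whenever it exits, while a batch Automaton ends its own lifetime by completing successfully.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Automaton:
    """Hypothetical stand-in for an Automaton (not the real Emergence type)."""
    name: str
    run: Callable[[], int]  # runs the workload and returns an exit code
    batch: bool             # True if the workload decides its own lifetime


def supervise(automaton: Automaton, max_runs: int = 3) -> List[int]:
    """Run an Automaton under a toy orchestrator loop.

    A service is restarted every time it exits. A batch Automaton is
    restarted only on failure and is left alone once it succeeds.
    max_runs bounds the loop so the sketch always terminates.
    """
    exit_codes = []
    for _ in range(max_runs):
        code = automaton.run()
        exit_codes.append(code)
        if automaton.batch and code == 0:
            break  # the application-level work is done, so the lifetime ends
    return exit_codes
```

With this sketch, a batch Automaton that fails once and then succeeds is run twice, whereas a service that exits cleanly is restarted until the `max_runs` bound is reached.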

Let's consider the tradeoffs of the 2 approaches:

CMCDragonkai commented 6 years ago

There is an existing concept called the "super-server" https://en.wikipedia.org/wiki/Super-server, which was the progenitor of socket activation in systemd: http://0pointer.de/blog/projects/socket-activated-containers.html and http://0pointer.de/blog/projects/socket-activation.html.

It shows how event-driven services can encapsulate other event-driven services, and thus event-driven services can definitely encapsulate batch tasks.

Note that inetd and xinetd were among the first systems to implement this on Linux.
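The super-server idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how inetd or systemd are implemented: the helper names `make_listener` and `serve_once` are made up here, and the task's output is captured and sent back, whereas a real inetd-style server hands the connected socket to the child as its stdin/stdout. The point is only that the batch task does not exist until an event (a connection) arrives, and it exits once its work is done.

```python
import socket
import subprocess
from typing import List


def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket. Port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    return listener


def serve_once(listener: socket.socket, command: List[str]) -> int:
    """Wait for one connection, then run `command` as a batch task.

    The task is started only when a client connects, runs to completion,
    and its standard output is sent back over the connection.
    Returns the task's exit code.
    """
    conn, _addr = listener.accept()
    with conn:
        result = subprocess.run(command, capture_output=True)
        conn.sendall(result.stdout)
    return result.returncode
```

Calling `serve_once` in a loop gives an event-driven service whose only job is to start batch tasks on demand, which is the encapsulation described above.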