Consider clarifying, is this an alternative for Kubeflow?

Hi @austinkeller

Or is it a more R&D-friendly replacement?

Kind of, but it also integrates with Kubeflow :)

Specifically, Kubeflow assumes all steps are self-contained containers and that data can be volume-mounted, etc. In this respect, trains-agent solves the containerization problem and adds logging to the process.

To understand how trains works, the usual dev steps are:

1. Write code on a "local" machine. With trains, the code, environment, and arguments are all logged (along with a few other things that are less relevant to our case).
2. Clone the experiment in the UI (or from code / automation).
3. Put the experiment into the execution queue (the trains scheduler; it also supports priorities etc., and its UI is part of the system UI, see trains-server).
4. trains-agent, running on a remote machine in daemon mode, pulls the experiment from the execution queue, sets up the environment accordingly, and launches / monitors the process.
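To make the flow above concrete, here is a toy model of it in plain Python. It does not use the trains API: `Experiment`, `ExecutionQueue`, `clone`, and `agent_run_once` are hypothetical names invented for illustration, and the priority convention (lower number runs first) is an assumption of this sketch rather than a statement about how trains-server orders its queues.

```python
import copy
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass
class Experiment:
    """Hypothetical stand-in for a recorded experiment (not the trains data model)."""
    name: str
    code: str                      # e.g. repository plus commit
    environment: list              # e.g. pinned Python packages
    arguments: dict = field(default_factory=dict)


class ExecutionQueue:
    """Toy queue: lower number = higher priority (assumed), FIFO among equals."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps insertion order

    def enqueue(self, experiment, priority=0):
        heapq.heappush(self._heap, (priority, next(self._order), experiment))

    def pull(self):
        """Return the next experiment, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


def clone(experiment, name, **overrides):
    """Copy a recorded experiment and override some of its arguments."""
    new = copy.deepcopy(experiment)
    new.name = name
    new.arguments.update(overrides)
    return new


def agent_run_once(queue):
    """What one polling cycle of a daemon-style agent might look like."""
    experiment = queue.pull()
    if experiment is None:
        return None
    # A real agent would recreate the environment and launch the process here.
    return f"running {experiment.name} with {experiment.arguments}"


# Step 1: the experiment recorded from the "local" run.
base = Experiment("baseline", "repo@abc123", ["numpy==1.19"], {"lr": 0.1})
# Steps 2-3: clone it with new arguments and enqueue the clones.
queue = ExecutionQueue()
queue.enqueue(clone(base, "low-lr", lr=0.01), priority=1)
queue.enqueue(clone(base, "urgent", lr=0.5), priority=0)
# Step 4: the agent pulls the highest-priority experiment first.
print(agent_run_once(queue))  # running urgent with {'lr': 0.5}
```

The point of the sketch is only the division of labour: the recorded experiment is the unit that gets cloned and queued, and the agent is the only component that touches the remote machine.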

Back to Kubeflow: since the experiment is created automatically (see step (1): trains records the environment and creates the experiment at runtime), trains-agent can build a Docker container for the experiment, to be used later by Kubeflow. This makes packaging a lot easier (see trains-agent build --docker). You can make it even lighter and use trains-agent to set up and launch an experiment without packaging it, by using a base container and letting trains-agent set up everything inside the container (see trains-agent execute).
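As a rough illustration of the two options, the helpers below assemble the corresponding command lines without running them. The helper names, the task ID, and the image names are hypothetical placeholders, and the flags (`--id`, `--target`, `--docker`) reflect my understanding of the CLI, so it is worth checking them against `trains-agent build --help` and `trains-agent execute --help` for your version.

```python
import shlex


def build_docker_command(task_id, target=None):
    """Command line for packaging an experiment into a Docker image."""
    cmd = ["trains-agent", "build", "--id", task_id]
    if target:
        # Name for the resulting image (assumed flag; verify with --help).
        cmd += ["--target", target]
    cmd.append("--docker")
    return cmd


def execute_command(task_id, base_image=None):
    """Command line for setting up and running an experiment directly."""
    cmd = ["trains-agent", "execute", "--id", task_id]
    if base_image:
        # Run inside a base container instead of on the host (assumed flag).
        cmd += ["--docker", base_image]
    return cmd


def to_shell(cmd):
    """Render an argument list as a copy-pasteable shell line."""
    return " ".join(shlex.quote(part) for part in cmd)


# "<task-id>" is a placeholder for the ID of the recorded experiment.
print(to_shell(build_docker_command("<task-id>", target="my-experiment-image")))
print(to_shell(execute_command("<task-id>", base_image="python:3.8")))
```

Passing either list to `subprocess.run(cmd, check=True)` would actually invoke the agent. The first form produces an image that a Kubeflow step can reference; the second keeps a generic base image and defers all setup to the moment the experiment starts.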

Does that remove a bit of the mystery? What exactly is your use case? (Is it more development-oriented, or at the productization stage?)

allegroai / clearml-agent

Consider clarifying, is this an alternative for Kubeflow? #32