vatlab / sos

SoS workflow system for daily data analysis
http://vatlab.github.io/sos-docs
BSD 3-Clause "New" or "Revised" License

support Kubernetes as a task queue #1260

Open tmbdev opened 5 years ago

tmbdev commented 5 years ago

Kubernetes is a powerful and full-featured task queue system for containers. It can easily be run locally, on a local cluster, or in the cloud. It supports GPUs, resource selectors, and error handling. Task queuing is directly supported via jobs, or can be implemented via pods.

It would be great if Kubernetes could be used directly as a task queue system from SoS and if SoS understood Kubernetes jobs, resources, volumes, and container registries.

BoPeng commented 5 years ago

I tried to learn kubernetes but was scared away by all the new concepts, commands, etc. I knew that things would get easier once I spent more time on it, but I never found the time needed. Support for k8s and other cloud/distributed systems is on the radar, but I will need some serious help (e.g. from a grant or other individuals/groups) to work on them.

Anyway, SoS' task system currently has the following parts:

  1. The SoS workflow system, which creates tasks (physical files), monitors their status, and retrieves results.
  2. Host agents, which work with local or remote hosts and execute commands.
  3. Task engines, which submit tasks to the actual computational environment.

Properties of the host agents and task engines are stored in a configuration file, ~/.sos/hosts. Because sos tasks monitor their own status (a pulse thread updates the status of each task), the task engines do not have to monitor the status of submitted jobs, which greatly simplifies the writing of task engines.
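Conceptually, the current remote execution flow boils down to something like the following sketch (much simplified; the function name, paths, and layout are illustrative, not the actual implementation):

import os
import subprocess

def run_task_on_remote(task_id: str, host: str) -> None:
    # Copy the task file to the remote host, then ask the remote sos
    # installation to execute it. The ~/.sos/tasks layout shown here
    # is illustrative.
    task_file = os.path.expanduser(f"~/.sos/tasks/{task_id}.task")
    subprocess.run(["rsync", "-a", task_file, f"{host}:.sos/tasks/"], check=True)
    subprocess.run(["ssh", host, "sos", "execute", task_id], check=True)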

Now, with close to zero knowledge of k8s, I think

  1. A separate host agent would be needed to interact with k8s. SoS currently only supports localhost and ssh/rsync/rcp-based remote hosts, so a plugin system to abstract host agents will be needed.

  2. A task engine module similar to sos-pbs would be needed, which would basically execute a shell script with a sos execute command in it.

@tmbdev What is your motivation for getting sos working with k8s? Will you be able to help, or even contribute a k8s module to SoS?

Edit: Note that one of the key design goals of SoS is to keep multi-language (polyglot notebook) and multi-platform (local and remote hosts) computations in one notebook, so the priority would be on submitting tasks to remote k8s systems, instead of running sos workflows on k8s systems.

tmbdev commented 5 years ago

K8s has a lot of functionality for a lot of different applications, but as far as SoS is concerned, it would simply function like any other job queueing system. You write a job spec in YAML or JSON; a job spec consists of a job name, a container name, a command line, and resource requirements. The command lines below show how much this is like any other job queuing system:

$ kubectl apply -f myjob.yaml # submit job
$ kubectl delete job/myjob # kill job
$ kubectl logs job/myjob # retrieve stdout/stderr
$ kubectl get jobs # list jobs and their status
$ kubectl exec ... # execute additional commands inside an existing job

There are Python libraries that encapsulate the above command lines, so an interface could be written either in terms of kubectl or in terms of the library.
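For example, with the official kubernetes Python client, submitting, checking, and deleting a job are only a few calls. A rough sketch; the job name, image, and command below are placeholders:

from kubernetes import client, config

config.load_kube_config()            # reads ~/.kube/config, like kubectl
batch = client.BatchV1Api()

job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "myjob"},
    "spec": {
        "backoffLimit": 0,           # do not retry a failed job
        "template": {"spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "main",
                "image": "python:3.11",
                "command": ["python", "-c", "print('hello')"],
                "resources": {"requests": {"memory": "2Gi", "cpu": "1"}},
            }],
        }},
    },
}

batch.create_namespaced_job(namespace="default", body=job)       # kubectl apply
status = batch.read_namespaced_job("myjob", "default").status    # kubectl get jobs
batch.delete_namespaced_job(                                     # kubectl delete
    "myjob", "default",
    body=client.V1DeleteOptions(propagation_policy="Background"))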

You don't need to worry about hosts, authentication, multiple clusters, or agents at all; K8s already takes care of that. From the point of view of SoS, K8s just looks like a local job queuing system; an interface should be easy to write.

What's the motivation? Kubernetes is a highly scalable container-based job queuing system with tons of additional functionality, support for persistent services, integration with many different kinds of storage architectures, native GPU support, and excellent resource management. It's easy to use, and it works the same on a desktop machine, on a local cluster, and in the cloud.

Yes, I can help with K8s integration, but I can't write one from scratch since I don't really understand how TaskEngine is supposed to work. There doesn't seem to be an abstract base class for task engines that describes what methods need to be provided or what they should do. I can't even find the source code for the TaskSpooler interface, which might be closest to what we need for K8s support.

(Incidentally, if you want to experiment with K8s, I found microk8s http://microk8s.io to be the simplest to install on a local machine.)

BoPeng commented 5 years ago

Ok, let me start by creating a sos-kubernetes module and picking up my k8s kindle book...

BoPeng commented 2 years ago

@tmbdev I apologize for all the delay. A lot has happened in the past few years (change of job, pandemic), and I only recently gained access to a production k8s cluster and have been able to learn how to use it.

I really like the concept of a k8s cluster and can use it for my work now. I am working on the sos-kubernetes module, but there are some practical issues. More specifically, sos currently copies task files to a remote system (e.g. a cluster) with sos installed, then calls sos execute to execute the task.

  1. It is not easy to copy task files to a k8s cluster since there is no "destination" per se. To solve this, we will have to copy the task file into the container, right?

  2. Supposing we build the container locally (since we need to COPY files over), we will need some way for users to specify dependencies (e.g. Python and R packages). The easiest way may be to allow users to supply a Dockerfile.

  3. Then sos can create YAML files for a pod or deployment and apply them to the cluster. The sos syntax should translate mem=2G to resources in the YAML file. However, I have not figured out a way to push local containers to the cluster without going through a service like Docker Hub.
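For item 3, the translation itself seems straightforward, something like the sketch below; the helper name and image are hypothetical:

def task_to_job(task_id: str, mem: str = "2G", cores: int = 1,
                image: str = "myimage:latest") -> dict:
    # Translate SoS task options (mem=2G, cores=1) into a k8s Job spec
    # whose container runs "sos execute <task_id>".
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"sos-{task_id}"},
        "spec": {"template": {"spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "task",
                "image": image,
                "command": ["sos", "execute", task_id],
                "resources": {"requests": {"memory": mem, "cpu": str(cores)}},
            }],
        }}},
    }

The resulting dict could be serialized to YAML and applied with kubectl, or submitted directly with the Python client.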

Once we get these problems resolved, I will design an interface and write the k8s task engine.

tmbdev commented 2 years ago

It is not easy to copy task files to a k8s cluster since there is no "destination" per se. To solve this, we will have to copy the task file into the container, right?

For configuration data, scripts, etc., you can use a ConfigMap. You can create one with a k8s YAML spec and mount it as a file system; that works everywhere.
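With the Python client this is only a couple of calls. A sketch; all names and files are placeholders:

from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# Put a local file into a ConfigMap...
with open("mytask.task") as f:
    core.create_namespaced_config_map(
        namespace="default",
        body=client.V1ConfigMap(
            metadata=client.V1ObjectMeta(name="sos-task"),
            data={"mytask.task": f.read()},
        ),
    )

# ...and mount it (read-only) via the pod spec of the job:
volumes = [{"name": "task", "configMap": {"name": "sos-task"}}]
volume_mounts = [{"name": "task", "mountPath": "/sos/tasks"}]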

(By the way, you should not do anything that requires volume plugins or mountable file systems, since that tends to fail in a lot of environments.)

Supposing we build the container locally (since we need to COPY files over), we will need some way for users to specify dependencies (e.g. Python and R packages). The easiest way may be to allow users to supply a Dockerfile.

No, the user should supply a container, not a Dockerfile. People increasingly use tools other than Docker for building their containers. Furthermore, building containers under programmatic control is difficult.

If you want to ensure that certain packages or files are present, you can install them during the initialization of the container.
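For example, the job's command can do the installation before the task runs; the packages and task id below are placeholders:

task_id = "t123"  # placeholder
command = ["bash", "-c", f"pip install sos pandas && sos execute {task_id}"]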

Then sos can create YAML files for a pod or deployment and apply them to the cluster. The sos syntax should translate mem=2G to resources in the YAML file. However, I have not figured out a way to push local containers to the cluster without going through a service like Docker Hub.

That's one reason why you should avoid building containers programmatically from within SoS: where and how containers get built needs to be under user control.

So, here is my suggestion:

For more complex needs, you can always run additional pods on the K8s server. For example, you can run etcd or redis for coordination, and/or you can run Minio or AIStore for large-scale object storage and/or data transfers.

BoPeng commented 2 years ago

For configuration data, scripts, etc., you can use a ConfigMap. You can create one with a k8s YAML spec and mount it as a file system; that works everywhere.

This sounds like the example in the documentation (https://kubernetes.io/docs/concepts/configuration/configmap/). The problem is that sos task files contain both parameters (and scripts, etc.) and, after tasks are completed, results, so we do need some way to retrieve the task files afterwards.

(By the way, you should not do anything that requires volume plugins or mountable file systems, since that tends to fail in a lot of environments.)

I agree, but users can specify a template YAML file that contains PV, PVC, Volume, etc. This is hard to avoid because users will most likely need to specify their own volumes to process data.
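For example, the user-supplied template could contribute fragments like the following to the generated pod spec (the claim name is a placeholder):

volumes = [{"name": "data",
            "persistentVolumeClaim": {"claimName": "my-data"}}]
volume_mounts = [{"name": "data", "mountPath": "/data"}]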

No, the user should supply a container, not a Dockerfile.

Agreed. SoS would have to build containers if it included the tasks inside the containers, which is certainly a bad idea.

So, here is my suggestion:

Sounds reasonable, but I have to learn more about the use of startup scripts.

For example, you can run etcd or redis for coordination,

It was probably a bad design, but SoS avoided using separate instances for coordination. This made running sos scripts relatively easy but made a lot of things more complicated than they should be (e.g. task status could be checked much more efficiently with a redis server).

tmbdev commented 2 years ago

It was probably a bad design, but SoS avoided using separate instances for coordination. This made running sos scripts relatively easy but made a lot of things more complicated than they should be (e.g. task status could be checked much more efficiently with a redis server).

I think that's a perfectly reasonable decision in many environments, where you simply can't start up persistent services.

However, in Kubernetes, it's easy to start up servers. So, I would take advantage of that for the SoS Kubernetes integration. That is, I would write a Kubernetes-specific backend for SoS that uses Redis (or some other server like that) for exchanging data.

BoPeng commented 2 years ago

However, in Kubernetes, it's easy to start up servers. So, I would take advantage of that for the SoS Kubernetes integration. That is, I would write a Kubernetes-specific backend for SoS that uses Redis (or some other server like that) for exchanging data.

This is great advice. It would be much easier to pass tasks and their status around if we started a separate redis server inside k8s. Actually, we do not have to implement k8s as a "task executor". Instead, we can implement k8s as a "workflow executor" that has a coordinator and multiple workers.
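Roughly, with a redis pod inside the cluster, the coordinator and workers could exchange tasks like this (a sketch; the queue and key names are made up):

import redis

r = redis.Redis(host="redis", port=6379)   # redis service inside the cluster

# coordinator: enqueue a task
r.rpush("sos:pending", "t123")

# worker loop: pop a task, execute it, record its status
item = r.blpop("sos:pending", timeout=5)
if item is not None:
    _, task_id = item
    r.hset("sos:status", task_id, b"running")
    # ... run "sos execute" on the task here ...
    r.hset("sos:status", task_id, b"completed")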