KristinaPlazonic closed this issue 2 years ago
@treydock until something like this is supported, are there any directions you could give on using Slurm to reproduce what you do with OSC's Quick cluster?
@KristinaPlazonic at OSC our workaround is a "Quick" cluster that we can configure interactive apps to submit to. This is a dedicated batch environment with a small number of compute nodes from each of our clusters, and we use a special flag to specify which node type to request. The jobs are all 1 proc per node, and the scheduler oversubscribes processors but provides dedicated access to memory, so it acts similar to what you describe. We call these "VDI", but that is a misnomer since the actual technology is not VDI. The experience is similar, though - you get the quick turnaround time, and it's for lightweight use.
I cc'd in @treydock because he has some ideas on how you would translate what we are doing with Torque to Slurm. I'm happy to provide more info if you are curious.
Recent relevant discussions on the mailing list:
You could configure SLURM to support gang scheduling: https://slurm.schedmd.com/gang_scheduling.html. You'd set up a dedicated partition whose nodes you want oversubscribed and set the oversubscription level on the partition. This may not be friendly to interactive jobs, as I think gang scheduling suspends jobs on a periodic basis.
I've never tried it, but it may be possible to define a set of nodes with more CPUs than actually exist; however, slurmd may throw errors if the CPUs configured in slurm.conf don't match the running system.
The idea of our Torque/Moab "quick" cluster is already built into SLURM, as the scheduler and resource manager are both capable of scheduling jobs and getting them running very quickly from the time they are submitted. Having a dedicated partition with nodes set aside for interactive work (assuming you also support non-interactive) is another way to ensure people doing interactive work don't have long wait times. The more challenging part is oversubscription.
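To make the gang-scheduling suggestion concrete, here is a minimal sketch of the relevant `slurm.conf` settings; the partition name, node names, and oversubscription factor are all hypothetical, not taken from any actual OSC configuration:

```
# Hypothetical slurm.conf fragment for an oversubscribed "quick" partition.
# Gang scheduling rotates suspended/running jobs on each timeslice.
PreemptType=preempt/partition_prio
PreemptMode=SUSPEND,GANG          # enable gang scheduling
SchedulerTimeSlice=30             # seconds each gang runs before rotation

# FORCE:4 lets up to 4 jobs share each CPU in this partition.
PartitionName=quick Nodes=quick[01-04] OverSubscribe=FORCE:4 MaxTime=04:00:00
```

As noted above, the periodic suspend/resume cycle may be noticeable in interactive sessions, which is the main trade-off of this approach.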
I would love for OpenOnDemand to support the ability to spin up dockerised applications on Kubernetes rather than on batch. There are more scientific applications nowadays that are basically self-contained webapps (such as CryoSPARC) that I would like to 'host' on OnDemand - just like I would a JupyterLab instance. I've been playing around with jupyter-server-proxy as a means to do something similar (i.e. proxying the webapp through Jupyter/JupyterHub).
See a lot of value in this.
It's coming soon! We actually already have it, it's just a little unstable and needs lots of prep/system configuration that we don't have documented.
That's great news @johrstrom ! Perhaps sharing a K8s example project is enough for some of us to start trying it out and not having to wait for the full documentation.
OK yea, may as well get this party started! I hope this helps. And indeed, I'll respond through this ticket if you have any more questions. As it is new, it would be a good idea to start getting 2nd opinions.
Here's a simple description of what we have here: we deploy pods (single-container pods) into the user's namespace. Every user has their own namespace, and we assume you've got that all locked down with the right policies (pod security policies are being deprecated, so we use Kyverno). A user with UID 1500 and GID 1501 will always start a container with those IDs. You can add supplemental groups automatically (below), but users are always forced to run containers as themselves. We set it up this way, so I don't believe there's a way around it.
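The "users always run as themselves" behavior described above corresponds to the standard Kubernetes pod `securityContext` fields; this is a sketch using the UID/GID from the example (the supplemental group value is invented for illustration, and the surrounding pod spec is elided):

```yaml
# Sketch: the kind of securityContext the adapter would render for a
# user with UID 1500 / GID 1501 (supplemental group 2000 is hypothetical).
securityContext:
  runAsUser: 1500            # container processes run as the user's UID
  runAsGroup: 1501           # ...and the user's primary GID
  supplementalGroups: [2000] # added when auto_supplemental_groups is on
  runAsNonRoot: true         # refuse to start if the image forces root
```

A policy engine such as Kyverno can then enforce that pods in a user's namespace never deviate from these values.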
Here's a yaml for the cluster definition. Things of note:

- `username_prefix` - this is the username prefix in your `~/.kube/config`, so one user doesn't collide with another (test and production and so on).
- `namespace_prefix` - all users get their own namespace, and this is the optional prefix for that namespace. We have Kyverno policies to be sure they match the regular expression `test-.+`.
- `--context` - I think another setting is `managed`, where the `~/.kube/config` is managed entirely outside of OnDemand and we don't pass/use or set a `--context`; we just assume it's set already.

```yaml
---
v2:
  metadata:
    title: "Kubernetes"
    hidden: true
  job:
    adapter: "kubernetes"
    cluster: "ood-test"
    bin: "/usr/local/bin/kubectl"
    username_prefix: "test-"
    namespace_prefix: "user-"
    all_namespaces: false
    auto_supplemental_groups: true
    server:
      endpoint: "https://your.k8s.host.edu"
      cert_authority_file: "/some/cert/file/kubernetes-ca.crt"
    auth:
      type: "oidc"
  batch_connect:
    ssh_allow: false
```
For OIDC we provide some shell script hooks that run before the PUN starts up to set up a user's `~/.kube/config`.

https://github.com/OSC/ondemand/tree/master/hooks

We use `pun_pre_hook_root_cmd` and `pun_pre_hook_exports` (another undocumented feature) to run these hooks. (One thing of note here, as you may see in the hooks: if you pass an environment variable `OIDC_FOO`, you need to access it in the hook scripts as `OOD_OIDC_FOO`.)
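To illustrate the `OOD_` prefix convention mentioned above, here is a hypothetical, heavily simplified hook sketch (variable names, the `alice` username, and the echoed `kubectl` invocation are all invented for illustration; the real OSC hooks differ):

```shell
# Hypothetical PUN pre-hook sketch. OnDemand prefixes variables listed in
# pun_pre_hook_exports with OOD_, so a variable exported as OIDC_ACCESS_TOKEN
# must be read as OOD_OIDC_ACCESS_TOKEN inside the hook.

export OOD_OIDC_ACCESS_TOKEN="example-token"   # set by OnDemand in reality
ood_user="alice"                               # passed to the real hooks

# Fail early if the prefixed variable is missing.
token="${OOD_OIDC_ACCESS_TOKEN:?not set}"

# A real hook would run this kubectl command; we only print it here.
echo "kubectl config set-credentials ${ood_user}-user --token=${token}"
```

The real hooks also handle token refresh and kubeconfig file permissions, which this sketch omits.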
We've updated our jupyter app to use kubernetes, though we're treating the container much more like the regular Slurm infrastructure in that it's a base image and we mount all sorts of stuff to get SSSD and the module system to work. So, nothing is really inside the container - it just looks like our base OS so that we can use host mounted files.
Thanks a lot for the details. Indeed, right now our biggest challenge is the IAM (identity and access management) part; we're also incorporating OIDC. Question, so you have moved completely away from Slurm into K8s? Is it feature complete? E.g. can users ssh into the pods, also from the command line? (I guess they simply run `kubectl exec ...`.) Do the users still need to use K8s port forwarding, or does OOD cover all those use cases?
Question, so you have moved completely away from Slurm into K8s?
No, only some interactive work. Specifically classroom support for, say, a 2000-level statistics class that uses RStudio. In fact, our kubernetes adapter only supports 1 container in 1 pod, so it's really only suited to interactive work right now.
is it feature complete?
We're using it as described above. So, it's as complete as that use case is.
can users ssh into the pods?
They cannot at OSC. I think they'd need `kubectl exec`, which we don't allow. We also added a feature to disable the SSH button per cluster (the `ssh_allow` setting in the cluster yaml above). There's some other bootstrapping we do in those hooks to give this or that, but generally it's pretty restrictive - or at least we only allow users to do just what they need to and no more.
Do the user still need to use K8s port forwarding or OOD covers all those use cases?
We create a `NodePort` service. See: https://github.com/OSC/ood_core/blob/c015410f5b331470a47c1de8aa6696f0c476eda7/lib/ood_core/job/adapters/kubernetes/templates/pod.yml.erb
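For readers unfamiliar with the pattern, this is roughly the shape of Service the linked template produces; every name, label, and port below is a placeholder, not taken from the actual template:

```yaml
# Sketch of a NodePort Service (all names/ports hypothetical).
apiVersion: v1
kind: Service
metadata:
  name: jupyter-abc123-service
  namespace: user-alice
spec:
  type: NodePort        # exposes the pod on a port of every node
  selector:
    job: jupyter-abc123 # must match the pod's labels
  ports:
    - protocol: TCP
      port: 8080        # port the container's app listens on
      targetPort: 8080  # the node port itself is auto-assigned
```

Because the app is reachable on an auto-assigned node port, OnDemand's proxy can connect directly and users never need `kubectl port-forward`.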
At long last, I can say this is complete. I've just published the documentation for kubernetes. Of course, there's still more to do, especially in terms of documentation, but I can say that we do now support Kubernetes in 2.0.
I'll keep this ticket apprised when there are other major updates, like the documentation being complete or another patch we're going to supply in 2.0.19.
https://osc.github.io/ood-documentation/latest/installation/resource-manager/kubernetes.html
We are using OnDemand for teaching computational science classes (using Slurm scheduler). These classes use Jupyter notebooks and RStudio, but usually these interactive jobs don't require a lot of resources. We could conserve resources by packing several Jupyter notebooks on the same cores/memory. So it would be really useful for scaling instructional infrastructure if we could launch jupyter notebooks and RStudio on Kubernetes (RStudio already launches as a container!) instead of Slurm. For example, JupyterHub is deployable on Kubernetes already.
Thanks for your consideration!