Add a soft affinity for non-GPU nodes

opendatahub-io-contrib / jupyterhub-odh

Example JupyterHub deployment using OpenShift OAuth authenticator.

Add a soft affinity for non-GPU nodes #135

Closed Xaenalt closed 2 years ago

Xaenalt commented 2 years ago

Causes notebooks to prefer non-GPU nodes unless a GPU is explicitly requested or no non-GPU nodes are available

Related Issues and Dependencies

This is a solution proposed in https://issues.redhat.com/browse/RHODS-3074

This introduces a breaking change

[ ] Yes
[X] No

This Pull Request implements

This makes pods always prefer non-GPU nodes unless a GPU is explicitly requested. As mentioned in the JIRA, this is probably what users expect. It shouldn't break anything: I tested it on my cluster and it worked. If all CPU nodes are full (or the pod can't be scheduled onto them for some other reason), the pod will go onto a GPU node.

There may be a more elegant way to do this, but that's the gist of the change

Description

With a preferredDuringSchedulingIgnoredDuringExecution node affinity, a notebook will always prefer non-GPU nodes. The affinity is only a scheduling preference, though, so adding a GPU to the resource requests will still force the pod onto a GPU node.
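For illustration, a soft affinity of this kind could look roughly like the sketch below, written as a Python dict in the shape of a Kubernetes pod `affinity` field. This is not the code from the PR: the helper name `non_gpu_soft_affinity` is hypothetical, and the node label `nvidia.com/gpu.present` is an assumption about how GPU nodes are labeled on the cluster.

```python
# Assumption: GPU nodes carry this label with the value "true".
# Clusters may label GPU nodes differently; adjust to match.
GPU_NODE_LABEL = "nvidia.com/gpu.present"


def non_gpu_soft_affinity(weight=1):
    """Build an affinity fragment that prefers nodes without the GPU label.

    'preferred...' terms are soft: the scheduler scores matching nodes
    higher but can still place the pod elsewhere if none are available.
    """
    return {
        "nodeAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,  # Kubernetes accepts 1-100
                    "preference": {
                        "matchExpressions": [
                            {
                                "key": GPU_NODE_LABEL,
                                # NotIn also matches nodes that lack the label
                                "operator": "NotIn",
                                "values": ["true"],
                            }
                        ]
                    },
                }
            ]
        }
    }
```

Because the term sits under `preferredDuringSchedulingIgnoredDuringExecution` rather than `requiredDuringSchedulingIgnoredDuringExecution`, a pod can still land on a GPU node when no other node fits.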

LaVLaS commented 2 years ago

I don't think we should be adding this as part of the default functionality for ODH JupyterHub. This would be best as an optional feature that users can enable and configure as part of jupyterhub-singleuser-profiles, in the same way that they can apply node tolerations for GPU notebooks.

Xaenalt commented 2 years ago

That's fair. This was an attempt to solve the issue listed in the JIRA. There's also the approach we've recommended, node taints/tolerations; this tries to thread the needle between both options.

It's worth noting that if you're requesting GPUs, you still get them; this just makes non-GPU workloads prefer non-GPU nodes.

Xaenalt commented 2 years ago

Made 2 changes:

- Now only adds the affinity if the user doesn't specify any GPUs in requests
- Merges the affinity dict with any that might already exist
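The two changes can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function names, the `nvidia.com/gpu` resource name, and the `nvidia.com/gpu.present` node label are all assumptions.

```python
import copy

GPU_RESOURCE = "nvidia.com/gpu"            # assumed extended-resource name
GPU_NODE_LABEL = "nvidia.com/gpu.present"  # assumed label on GPU nodes

# Soft preference for nodes that are not labeled as GPU nodes.
PREFER_NON_GPU_TERM = {
    "weight": 1,
    "preference": {
        "matchExpressions": [
            {"key": GPU_NODE_LABEL, "operator": "NotIn", "values": ["true"]}
        ]
    },
}


def requests_gpu(resources):
    """Return True if the requests or limits ask for at least one GPU."""
    for section in ("requests", "limits"):
        value = (resources or {}).get(section, {}).get(GPU_RESOURCE, 0)
        if int(value) > 0:
            return True
    return False


def with_non_gpu_preference(affinity, resources):
    """Return a copy of `affinity`, adding the soft term for non-GPU pods."""
    merged = copy.deepcopy(affinity) if affinity else {}
    if requests_gpu(resources):
        return merged  # GPU pods keep whatever affinity they already had
    node_affinity = merged.setdefault("nodeAffinity", {})
    preferred = node_affinity.setdefault(
        "preferredDuringSchedulingIgnoredDuringExecution", []
    )
    if PREFER_NON_GPU_TERM not in preferred:  # avoid duplicating the term
        preferred.append(copy.deepcopy(PREFER_NON_GPU_TERM))
    return merged
```

Working on a deep copy leaves the caller's existing affinity untouched, and any `requiredDuringSchedulingIgnoredDuringExecution` rules already present are carried over unchanged.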

LaVLaS commented 2 years ago

@Xaenalt The ability to apply nodeAffinity to notebook pods is already a supported feature of jupyterhub-singleuser-profiles (JSP).

The docs for enabling and configuring it are located here - https://github.com/opendatahub-io/jupyterhub-singleuser-profiles/blob/master/docs/configuration.md

Xaenalt commented 2 years ago

Opened https://github.com/opendatahub-io/odh-manifests/pull/556 as an alternate solution

Xaenalt commented 2 years ago

The main difference between these approaches is that this one adds the affinity conditionally, only if the user doesn't request a GPU. That might be preferable, since it minimizes confusion for a user who is debugging something, though the two approaches should be identical in functionality.

LaVLaS commented 2 years ago

I don't think we should be automatically applying this in the jupyterhub_config for all non-GPU configs. I still believe this is better supported in JSP, so that a user can choose to enable it for non_gpu pods with a custom JSP configmap. We already support user configuration of affinity, and this conditional CPU-only behavior should be added to JSP to support non_gpu affinity.