kubernetes-sigs / jobset

JobSet: a k8s native API for distributed ML training and HPC workloads
https://jobset.sigs.k8s.io/
Apache License 2.0
138 stars 46 forks

feature: Topology Domain with JobSet #637

Open googs1025 opened 2 months ago

googs1025 commented 2 months ago

What would you like to be added:

Why is this needed: As far as I know, JobSet implements topology domain scheduling. However, in testing I found that the outcome depends on whether a node carries the topology label. For example, if a node has the "node-group" label, the JobSet can be scheduled; if a node has no "node-group" label, the JobSet cannot be scheduled.

In practice, we often use a single label to mark different areas or to distinguish different businesses to form a node pool. For example: node-group=group1, node-group=group2, etc. When I tested this method, I found that the existing alpha.jobset.sigs.k8s.io/exclusive-topology could not meet this scenario. Do we need to consider this scenario?

This enhancement requires the following artifacts:

The artifacts should be linked in subsequent comments.

googs1025 commented 2 months ago

/kind feature

googs1025 commented 2 months ago

The "alpha.jobset.sigs.k8s.io/exclusive-topology" annotation can be used to distinguish nodes by certain node labels, for example, nodes that have GPUs or nodes in a specific network partition.

danielvegamyhre commented 1 month ago

To clarify, exclusive job placement per topology means each Job's pods will be colocated within a group of nodes with the same value for the given node label (e.g., all nodes with the label "cloud.google.com/gke-nodepool=my-nodepool").

Once a pod from a given Job has landed on a node, no other Job's pods will be allowed to land on nodes with the label "cloud.google.com/gke-nodepool=my-nodepool" - the first Job has exclusive usage of them.
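
To make the semantics above concrete, here is a minimal toy model in Python. It is only a sketch of the behavior described in this thread, not JobSet's or the scheduler's actual implementation. The function names (`group_by_topology`, `place_jobs_exclusively`), the node names, and the first-come assignment order are all hypothetical. It assumes that nodes without the topology label belong to no domain, which matches the observation in the original report.

```python
from collections import defaultdict


def group_by_topology(nodes, topology_key):
    """Group node names by their value for topology_key.

    Nodes that do not carry the label are skipped (assumption: an
    unlabeled node belongs to no topology domain).
    """
    domains = defaultdict(list)
    for name, labels in nodes.items():
        value = labels.get(topology_key)
        if value is not None:
            domains[value].append(name)
    return dict(domains)


def place_jobs_exclusively(jobs, nodes, topology_key):
    """Give each job exclusive use of one topology domain.

    Returns a mapping job -> domain value, or None when no free
    domain is left for that job (it would stay unscheduled).
    """
    domains = group_by_topology(nodes, topology_key)
    free = sorted(domains)  # deterministic order, for the example only
    placement = {}
    for job in jobs:
        placement[job] = free.pop(0) if free else None
    return placement


# Hypothetical cluster: two labeled node groups and one unlabeled node.
nodes = {
    "node-a1": {"node-group": "group1"},
    "node-a2": {"node-group": "group1"},
    "node-b1": {"node-group": "group2"},
    "node-c1": {},  # no "node-group" label, so it is in no domain
}

placement = place_jobs_exclusively(["job-0", "job-1", "job-2"], nodes, "node-group")
print(placement)
# {'job-0': 'group1', 'job-1': 'group2', 'job-2': None}
```

In this model the third Job finds no free domain, and the unlabeled node is never a candidate, because each domain is owned by at most one Job.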

> In practice, we often use a single label to mark different areas or to distinguish different businesses to form a node pool. For example: node-group=group1, node-group=group2, etc. When I tested this method, I found that the existing alpha.jobset.sigs.k8s.io/exclusive-topology could not meet this scenario.

I'm not sure what you mean here. If we have node pools where each pool's nodes are grouped via node labels (e.g., cloud.google.com/gke-nodepool=A, cloud.google.com/gke-nodepool=B, etc.), then exclusive job placement per node pool via specifying alpha.jobset.sigs.k8s.io/exclusive-topology=cloud.google.com/gke-nodepool is supported and well-tested.