Open mimowo opened 2 days ago
The proposal is to extend the workload PodSetTopologyRequest API with the following fields:
// PodIndexLabel indicates the name of the label indexing the pods.
// For example, in the context of
// - Kubernetes Job: batch.kubernetes.io/job-completion-index
// - JobSet: batch.kubernetes.io/job-completion-index (inherited from Job)
// - Kubeflow: training.kubeflow.org/replica-index
PodIndexLabel *string
// SubGroupIndexLabel indicates the name of the label indexing the instances of replicated Jobs (groups)
// within a PodSet. For example, in the context of JobSet this is jobset.sigs.k8s.io/job-index.
SubGroupIndexLabel *string
// SubGroupCount indicates the count of replicated Jobs (groups) within a PodSet.
// For example, in the context of JobSet this value is read from jobset.sigs.k8s.io/replicatedjob-replicas.
SubGroupCount *int32
The values could then be set when implementing the PodSets() function of the GenericJob interface, via the PodSetTopologyRequest helper function, like here.
The TopologyUngater could then read these fields from the API instead of performing the lookups.
cc @PBundyra @tenzen-y @mwielgus @mwysokin
What would you like to be added:
An API that allows custom CRD jobs to use custom pod index labels, so that in-house Jobs have no incentive to use labels reserved for Kubernetes.
Why is this needed:
Completion requirements: