What would you like to be added:
The ability for a cluster admin to configure how Kueue derives the resource requirements of a Workload from the resource requests/limits in the PodSpecs of the submitted Job.
Why is this needed:
Configurable Resource transformations would enable more flexible definitions of Quotas that can be both simpler and more powerful than those possible via simple mirroring of the PodSpec Resources of Jobs into Workloads. It would support at least the following scenarios:
Reducing multiple complex, related accelerator resources into a simpler resource that is more suitable for quota management. The motivating example here is the various MIG resources created by the NVIDIA GPU Operator when it is operating in a mixed strategy.
Mapping multiple resources into an abstract currency that can be used to define quotas in terms of the relative cost of the resources (e.g. cheap vs. expensive GPUs, or spot vs. regular cloud VMs).
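To make the two scenarios concrete, here is a minimal sketch of what such a transformation could compute. It is illustrative only and not a proposed Kueue API: the function `transform_requests`, the mapping tables, the `retain_inputs` flag, and the `example.com/...` resource names are all hypothetical, and the conversion factors are made up. Only the `nvidia.com/mig-*` names follow the naming used by the mixed MIG strategy.

```python
from collections import defaultdict

# Hypothetical admin-supplied mapping for scenario 1:
# input resource -> {output resource: units produced per input unit}.
MIG_TO_MEMORY = {
    "nvidia.com/mig-1g.5gb": {"example.com/accelerator-memory-gb": 5},
    "nvidia.com/mig-2g.10gb": {"example.com/accelerator-memory-gb": 10},
    "nvidia.com/mig-3g.20gb": {"example.com/accelerator-memory-gb": 20},
}

# Hypothetical mapping for scenario 2: price GPUs in an abstract currency.
GPU_TO_CREDITS = {
    "example.com/gpu-cheap": {"example.com/credits": 1},
    "example.com/gpu-expensive": {"example.com/credits": 4},
}


def transform_requests(requests, transforms, retain_inputs=False):
    """Derive Workload-level quantities from PodSpec-level requests.

    Resources without a mapping are mirrored unchanged. Mapped resources
    are replaced by their outputs unless retain_inputs is True.
    """
    out = defaultdict(int)
    for name, quantity in requests.items():
        mapping = transforms.get(name)
        if mapping is None:
            out[name] += quantity  # plain mirroring, as happens today
            continue
        for target, factor in mapping.items():
            out[target] += quantity * factor
        if retain_inputs:
            out[name] += quantity
    return dict(out)


# Scenario 1: several MIG slice types collapse into one quota-friendly resource.
pod_requests = {"cpu": 4, "nvidia.com/mig-1g.5gb": 2, "nvidia.com/mig-2g.10gb": 1}
workload = transform_requests(pod_requests, MIG_TO_MEMORY)

# Scenario 2: different GPU classes are charged against one credit budget.
credits = transform_requests(
    {"example.com/gpu-cheap": 2, "example.com/gpu-expensive": 1}, GPU_TO_CREDITS
)
```

With these assumed factors, a quota could be defined on the single derived resource (accelerator memory or credits) rather than on each underlying resource separately.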
Both scenarios were discussed in the Batch WG call of 8/29/24 (https://www.youtube.com/watch?v=5nb_Ut-PLac), resulting in a decision to open a KEP to refine a design for this capability. The presentation is attached here: BatchWG-MIGResourceAbstraction.pdf
Completion requirements:
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.