At present, quota for DRA resources is enforced the same way as for other resources, i.e. at admission time rather than at allocation time.
We propose enforcing quota for DRA resources at allocation time instead.
Pros:
Can limit resource consumption based on what is actually allocated to a user, rather than on what is requested (a request might only state a lower limit).
Supports creating more claims and pods than can currently run ("batching"; might not be relevant).
Can support "one of" semantics (if X exceeds quota, use Y).
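The "one of" case above can be illustrated with a minimal sketch. It assumes a per-namespace quota keyed by device class and an ordered list of alternatives. The types `Alternative` and `Quota` and the function `PickAlternative` are hypothetical names made up for this example; they are not part of the Kubernetes or DRA API.

```go
package main

import "fmt"

// Alternative is a hypothetical stand-in for one option in a "one of" request.
type Alternative struct {
	Name        string
	DeviceClass string
	Count       int64
}

// Quota is a hypothetical per-namespace limit and usage, keyed by device class.
type Quota struct {
	Limit map[string]int64
	Used  map[string]int64
}

// Fits reports whether allocating count more devices of the given class stays
// within the limit. Classes without a limit entry are treated as unlimited.
func (q *Quota) Fits(class string, count int64) bool {
	limit, ok := q.Limit[class]
	if !ok {
		return true
	}
	return q.Used[class]+count <= limit
}

// PickAlternative walks the alternatives in order, returns the first one that
// fits within quota, and charges it to the usage counters.
func PickAlternative(q *Quota, alts []Alternative) (Alternative, error) {
	if q.Used == nil {
		q.Used = map[string]int64{}
	}
	for _, a := range alts {
		if q.Fits(a.DeviceClass, a.Count) {
			q.Used[a.DeviceClass] += a.Count
			return a, nil
		}
	}
	return Alternative{}, fmt.Errorf("no alternative fits within quota")
}

func main() {
	q := &Quota{
		Limit: map[string]int64{"gpu-large": 2, "gpu-small": 8},
		Used:  map[string]int64{"gpu-large": 2, "gpu-small": 1},
	}
	alts := []Alternative{
		{Name: "X", DeviceClass: "gpu-large", Count: 1},
		{Name: "Y", DeviceClass: "gpu-small", Count: 2},
	}
	chosen, err := PickAlternative(q, alts)
	fmt.Println(chosen.Name, err)
}
```

In this sketch X would push `gpu-large` over its limit, so Y is chosen instead. With admission-time quota the same request would presumably have to be rejected or counted up front, before it is known which alternative gets allocated.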
Cons:
All schedulers also need to consider ResourceQuota when checking devices.
Exceeding quota has to be reported as a scheduling failure. On the other hand, users typically don't create ResourceClaims manually, so admission checks also involve some indirection.
We believe the pros outweigh the cons, as this enables use cases such as putting quota on the total amount of GPU memory allocated rather than strictly on the number of devices allocated.
One-line enhancement description (can be used as a release note):
Enforce quota for DRA resources at allocation time instead of admission time
Enhancement Description
Kubernetes Enhancement Proposal: TBD
Discussion Link: https://github.com/kubernetes-sigs/wg-device-management/issues/24
Primary contact (assignee): @klueska, @pohly, @johnbelamaric, @thockin
Responsible SIGs: /sig node /sig scheduling
Enhancement target (which target maps to which milestone):
[ ] Alpha
[ ] KEP (k/enhancements) update PR(s):
[ ] Code (k/k) update PR(s):
[ ] Docs (k/website) update PR(s):