What would you like to be added:
Kueue is a great project that focuses on job queueing and resource management, and it can also support inference services by managing Pods. This approach is efficient because Kueue has a cluster-wide view and knows up front whether a given GPU type is in short supply, rather than discovering the shortage through failover at runtime.
What's more, if Kueue is already one of your components, this would be especially convenient.
Why is this needed:
Fungibility: the ability to fall back to another GPU type when the preferred one has insufficient capacity.
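To make the idea concrete, here is a minimal sketch of an admission-time fallback across GPU types. It is a toy model, not Kueue's API: the `Flavor` class, the `admit` function, and the flavor names `a100` and `l4` are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Flavor:
    """Hypothetical GPU type with a fixed quota (illustration only)."""
    name: str
    quota: int      # total GPUs of this type available to the queue
    used: int = 0   # GPUs already handed out to admitted workloads

    def free(self) -> int:
        return self.quota - self.used


def admit(flavors: List[Flavor], gpus: int) -> Optional[str]:
    """Pick the first flavor, in preference order, that can fit the request.

    The choice is made from the queue's view of quota before any Pod starts,
    instead of starting on the preferred GPU type and failing over at runtime.
    Returns the chosen flavor name, or None if nothing fits.
    """
    for flavor in flavors:
        if flavor.free() >= gpus:
            flavor.used += gpus
            return flavor.name
    return None


# The preferred type is fully used, so the request falls back to the next one.
flavors = [Flavor("a100", quota=8, used=8), Flavor("l4", quota=16, used=4)]
choice = admit(flavors, gpus=4)
print(choice)
```

A request that fits no flavor gets `None` back and would simply stay queued in this model, which is the part a runtime failover approach cannot know in advance.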
Completion requirements:
This enhancement requires the following artifacts:
- [x] Design doc
- [ ] API change
- [x] Docs update
The artifacts should be linked in subsequent comments.