InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Apache License 2.0

Integrate with Kueue for fungibility capacity #74

Open kerthcet opened 3 months ago

kerthcet commented 3 months ago

What would you like to be added:

Kueue is a great project that focuses on job queueing and resource management. It can also support inference services by managing Pods. This is efficient because Kueue has an overview of the whole cluster and knows whether a given GPU type has enough capacity before a workload is admitted, compared to failing over at runtime.

What's more, if Kueue is already one of the components running in your cluster, this integration would be really great!
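
To make the idea concrete, here is a minimal sketch of the admission-time decision described above: try GPU types in preference order and fall back to the next one when the preferred type has no free quota. This is only an illustrative model, not Kueue's API or llmaz code; `GpuFlavor`, `admit`, and the flavor names and quotas are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GpuFlavor:
    """Hypothetical GPU type with a fixed quota (not a Kueue type)."""
    name: str
    quota: int      # total GPUs of this type the queue may use
    used: int = 0   # GPUs already assigned to admitted workloads

    def free(self) -> int:
        return self.quota - self.used


def admit(flavors: List[GpuFlavor], gpus_needed: int) -> Optional[str]:
    """Assign the request to the first flavor, in preference order, with enough free quota.

    Returns the chosen flavor name, or None if the request has to stay queued.
    """
    for flavor in flavors:
        if flavor.free() >= gpus_needed:
            flavor.used += gpus_needed
            return flavor.name
    return None


# Preference order: try "a100" first, then fall back to "l4" (made-up quotas).
flavors = [GpuFlavor("a100", quota=4), GpuFlavor("l4", quota=8)]

first = admit(flavors, 4)    # fits the preferred flavor
second = admit(flavors, 2)   # a100 is full, so the request falls back to l4
third = admit(flavors, 8)    # only 6 l4 GPUs remain, so the request stays queued

print(first, second, third)  # a100 l4 None
```

The point of the sketch is that the fallback happens before any Pod starts, because the queue already knows the remaining quota per GPU type, rather than after a Pod fails to schedule.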

Why is this needed:

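Capacity fungibility: an inference workload can be placed on an alternative GPU type when the preferred one has insufficient capacity.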

Completion requirements:

This enhancement requires the following artifacts:

The artifacts should be linked in subsequent comments.

kerthcet commented 3 months ago

/kind feature