InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Integrate with Kueue for fungibility capacity #74

Open kerthcet opened 1 month ago

kerthcet commented 1 month ago

What would you like to be added:

Kueue is a great project that focuses on job queueing and resource management. It can also support inference services by managing Pods. This is efficient because Kueue has an overview of the whole cluster and knows up front whether a given GPU type is insufficient, rather than discovering it through runtime failover.
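To make the idea concrete, here is a minimal sketch of choosing a GPU type at admission time instead of failing over at runtime. It is a toy model, not Kueue's actual admission logic: `Flavor`, `pick_flavor`, `inference_pod`, the flavor names, the queue name `inference-queue`, and the container image are all hypothetical. The only Kueue-specific detail is the `kueue.x-k8s.io/queue-name` label, which is how a Pod is pointed at a LocalQueue.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Kueue label that assigns a workload (including a plain Pod) to a LocalQueue.
QUEUE_NAME_LABEL = "kueue.x-k8s.io/queue-name"


@dataclass
class Flavor:
    """Toy stand-in for a GPU type with its remaining quota (hypothetical type)."""
    name: str
    free_gpus: int


def pick_flavor(flavors: list[Flavor], gpus_needed: int) -> Optional[Flavor]:
    """Return the first flavor, in preference order, that has enough free GPUs."""
    for flavor in flavors:
        if flavor.free_gpus >= gpus_needed:
            return flavor
    return None  # nothing fits; the workload would stay queued


def inference_pod(name: str, queue: str, gpus: int) -> dict:
    """Build a minimal Pod manifest labelled for a Kueue LocalQueue."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {QUEUE_NAME_LABEL: queue}},
        "spec": {
            "containers": [{
                "name": "server",
                "image": "example.com/inference-server:latest",  # placeholder image
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }]
        },
    }


# The preferred flavor is exhausted, so the second one is chosen up front.
flavors = [Flavor("a100", free_gpus=0), Flavor("l4", free_gpus=4)]
chosen = pick_flavor(flavors, gpus_needed=2)
pod = inference_pod("llama-0", queue="inference-queue", gpus=2)
```

The point of the sketch is only that the decision is made from cluster-wide quota information before the Pod starts, which is what distinguishes this approach from runtime failover.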

What's more, if Kueue is already one of the components in your cluster, this integration would be especially valuable.

Why is this needed:

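Capacity fungibility: when the preferred GPU type is insufficient, the inference service can fall back to another GPU type.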

Completion requirements:

This enhancement requires the following artifacts:

The artifacts should be linked in subsequent comments.

kerthcet commented 1 month ago

/kind feature