InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Model aware scheduling #96

Open kerthcet opened 3 weeks ago

kerthcet commented 3 weeks ago

What would you like to be added:

Right now, model management is a tricky problem in the cluster. Models are big, so we need to cache them on the node just like images. However, kubelet manages the lifecycle of images but not of files, so that's a problem, and it will not be tackled in the near future. So maybe we need to manage the models manually and make the scheduler aware of them so it can use that information for pod placement decisions.
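
To make the idea a bit more concrete, here is a rough sketch in Go of what "model aware" placement could mean: prefer nodes that already have the requested model cached. Everything in it is hypothetical and only for illustration (the `Node` type, `scoreNode`, `pickNode`, and the `model-x` name). It does not use the kube-scheduler plugin API and is not a proposed design for llmaz.

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a hypothetical view of a cluster node: its name and the set of
// model names already cached on its local disk. How this information is
// collected is left open in this sketch.
type Node struct {
	Name         string
	CachedModels map[string]bool
}

// scoreNode returns a higher score when the node already holds the model,
// so a pod placed there would not need to download it again.
func scoreNode(n Node, model string) int {
	if n.CachedModels[model] {
		return 100
	}
	return 0
}

// pickNode returns the name of the best-scoring node for the given model.
// Ties are broken by node name to keep the result deterministic.
// It returns "" when no nodes are given.
func pickNode(nodes []Node, model string) string {
	if len(nodes) == 0 {
		return ""
	}
	// Sort a copy so the caller's slice is left untouched.
	ranked := make([]Node, len(nodes))
	copy(ranked, nodes)
	sort.SliceStable(ranked, func(i, j int) bool {
		si, sj := scoreNode(ranked[i], model), scoreNode(ranked[j], model)
		if si != sj {
			return si > sj
		}
		return ranked[i].Name < ranked[j].Name
	})
	return ranked[0].Name
}

func main() {
	nodes := []Node{
		{Name: "node-a", CachedModels: map[string]bool{}},
		{Name: "node-b", CachedModels: map[string]bool{"model-x": true}},
	}
	// node-b already caches model-x, so it is preferred.
	fmt.Println(pickNode(nodes, "model-x"))
}
```

In a real implementation the scoring would presumably live in a scheduler plugin or extender, and the cache state would have to be reported by whatever component manages the model files on each node.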

Why is this needed:

Efficient pod scheduling with models

Completion requirements:

This enhancement requires the following artifacts:

The artifacts should be linked in subsequent comments.

kerthcet commented 3 weeks ago

/kind feature