InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Lack the flexibility to express deploy primitives #81

Open kerthcet opened 1 month ago

kerthcet commented 1 month ago

What would you like to be cleaned:

For example, people want to deploy models with different scheduling primitives: should the model servers be colocated with other workloads, or run exclusively on dedicated nodes?
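As an illustration of the two modes (this is not llmaz's actual API, just plain Kubernetes pod affinity; the labels are hypothetical):

```yaml
# Hypothetical sketch of "exclusive": a hard anti-affinity rule keeps a model
# server pod off any node already running another inference pod.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: inference            # hypothetical label
        topologyKey: kubernetes.io/hostname
---
# Hypothetical sketch of "colocated": a soft affinity rule prefers landing
# next to pods that serve the same model.
affinity:
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              model: my-model         # hypothetical label
          topologyKey: kubernetes.io/hostname
```

The open question in this issue is how (or whether) to surface primitives like these through llmaz's own abstractions.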

Why is this needed:

Expressing deploy primitives.

kerthcet commented 1 month ago

/kind question /remove-kind cleanup

kerthcet commented 1 month ago

Right now, people can deploy a more advanced inference workload via Service; that is supported. But with Playground this is not workable.

kerthcet commented 1 month ago

However, topology matters, especially in multi-host scenarios; I think this is the key problem.
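To make the multi-host concern concrete: the workers of one serving group should land in the same topology domain, otherwise inter-host tensor traffic crosses slow links. A hedged sketch using plain Kubernetes pod affinity (the group label is hypothetical, not an llmaz field):

```yaml
# Hypothetical sketch: force every pod of one multi-host serving replica into
# the same zone, so cross-host communication stays on local links.
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            serving-group: model-a-replica-0   # hypothetical group label
        topologyKey: topology.kubernetes.io/zone
```

Whatever primitive llmaz ends up exposing would need to express this kind of per-group topology constraint, not just per-pod placement.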

kerthcet commented 1 month ago

The general idea would be: