InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Lack the flexibility to express deploy primitives #81

Open kerthcet opened 1 month ago

kerthcet commented 1 month ago

What would you like to be cleaned:

For example, people want to deploy models with different scheduling primitives: should the model servers be colocated with other workloads, or run exclusively on dedicated nodes?
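As an illustration of the two modes (this is not llmaz's actual API, just plain Kubernetes pod affinity; the labels are hypothetical):

```yaml
# Hypothetical sketch of "exclusive": a hard anti-affinity rule keeps a model
# server pod off any node already running another inference pod.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: inference            # hypothetical label
        topologyKey: kubernetes.io/hostname
---
# Hypothetical sketch of "colocated": a soft affinity rule prefers landing
# next to pods that serve the same model.
affinity:
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              model: my-model         # hypothetical label
          topologyKey: kubernetes.io/hostname
```

The open question in this issue is how (or whether) to surface primitives like these through llmaz's own abstractions.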

Why is this needed:

Expressing deploy primitives.

kerthcet commented 1 month ago

/kind question /remove-kind cleanup

kerthcet commented 1 month ago

Right now, people can deploy a more advanced inference workload via Service; that is supported. But with Playground this is not workable.

kerthcet commented 1 month ago

However, topology matters, especially in multi-host scenarios; I think this is the key problem.
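To make the multi-host concern concrete: the workers of one serving group should land in the same topology domain, otherwise inter-host tensor traffic crosses slow links. A hedged sketch using plain Kubernetes pod affinity (the group label is hypothetical, not an llmaz field):

```yaml
# Hypothetical sketch: force every pod of one multi-host serving replica into
# the same zone, so cross-host communication stays on local links.
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            serving-group: model-a-replica-0   # hypothetical group label
        topologyKey: topology.kubernetes.io/zone
```

Whatever primitive llmaz ends up exposing would need to express this kind of per-group topology constraint, not just per-pod placement.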

kerthcet commented 1 month ago

The general idea would be: