InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Support Deployment for serving most models #32

Open kerthcet opened 3 months ago

kerthcet commented 3 months ago

We support lws (LeaderWorkerSet) as the default workload; however, in most cases multi-host serving is not needed, even for Llama 3.1 405B. So a plain Deployment may be a better choice.
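
For single-host serving, a vanilla Deployment with one model replica per Pod would be enough. A minimal sketch of what that could look like using the standard Kubernetes Go API types (the image, labels, and GPU limit here are placeholders, not llmaz's actual defaults):

```go
package workload

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// singleHostDeployment sketches a plain Deployment serving one model
// replica per Pod, as an alternative to a multi-host LeaderWorkerSet.
func singleHostDeployment(name, image string) *appsv1.Deployment {
	replicas := int32(1)
	labels := map[string]string{"app": name}
	return &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: appsv1.DeploymentSpec{
			Replicas: &replicas,
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  "model-server",
						Image: image, // e.g. a vLLM serving image
						Resources: corev1.ResourceRequirements{
							Limits: corev1.ResourceList{
								// Single GPU per replica; adjust per model size.
								"nvidia.com/gpu": resource.MustParse("1"),
							},
						},
					}},
				},
			},
		},
	}
}
```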

kerthcet commented 3 months ago

/kind feature

kerthcet commented 3 months ago

The most relevant part is the API design.
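
One way the API could express this is a workload-type field on the serving spec, defaulting to LeaderWorkerSet and allowing a plain Deployment for single-host cases. The type and field names below are purely illustrative, not the actual llmaz API:

```go
package v1alpha1

// WorkloadType selects which Kubernetes workload backs the inference
// service. Hypothetical type for illustration only.
type WorkloadType string

const (
	// LeaderWorkerSetWorkload spans multiple hosts per replica (the current default).
	LeaderWorkerSetWorkload WorkloadType = "LeaderWorkerSet"
	// DeploymentWorkload runs each replica as a single-host Pod.
	DeploymentWorkload WorkloadType = "Deployment"
)

// ServiceSpec sketches where such a field could live.
type ServiceSpec struct {
	// WorkloadType chooses the backing workload; defaults to LeaderWorkerSet.
	// +optional
	WorkloadType WorkloadType `json:"workloadType,omitempty"`

	// Replicas is the number of serving replicas.
	// +optional
	Replicas *int32 `json:"replicas,omitempty"`
}
```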

kerthcet commented 3 months ago

/milestone v0.1.0

kerthcet commented 3 months ago

/priority important-soon

kerthcet commented 3 months ago

However, this adds complexity to workload orchestration.

kerthcet commented 3 months ago

Let's set this aside for now given the complexity. /milestone clear

kerthcet commented 3 months ago

/priority backlog

kerthcet commented 3 months ago

/remove-priority important-soon

kerthcet commented 3 months ago

/kind question /remove-kind feature