InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Feat: support sglang backend #46

Closed. vicoooo26 closed this 3 months ago

vicoooo26 commented 3 months ago

Support the sglang backend.

Ref: https://github.com/InftyAI/llmaz/issues/39

kerthcet commented 3 months ago

/kind feature

kerthcet commented 3 months ago

/hold

kerthcet commented 3 months ago

/hold cancel

kerthcet commented 3 months ago

Can you provide an example like https://github.com/InftyAI/llmaz/tree/main/docs/examples/huggingface? If you don't have an environment, I can help with that as a follow-up.

Generally, whenever we add a new backend or data source, we should provide examples for reference, which is friendlier for users. But yes, that can be a follow-up. :)
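
For illustration, a minimal sketch of what a docs/examples entry for sglang could look like, modeled loosely on the huggingface example linked above. The apiVersions, field names (modelClaim, backendRuntimeConfig), and the Qwen2 model are assumptions on my part and may not match the exact schema at this commit:

```yaml
# Sketch only: modeled on docs/examples/huggingface; field names and apiVersions
# are assumptions and may differ from the actual CRDs at this point in the project.
apiVersion: llmaz.io/v1alpha1
kind: OpenModel
metadata:
  name: qwen2-0--5b-instruct
spec:
  familyName: qwen2
  source:
    modelHub:
      modelID: Qwen/Qwen2-0.5B-Instruct
---
apiVersion: inference.llmaz.io/v1alpha1
kind: Playground
metadata:
  name: qwen2-0--5b-instruct
spec:
  replicas: 1
  modelClaim:
    modelName: qwen2-0--5b-instruct
  backendRuntimeConfig:
    backendName: sglang   # select the sglang backend instead of the default
```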

vicoooo26 commented 3 months ago

Sure, I'll add some docs later.

vicoooo26 commented 3 months ago
  1. Update .golangci.yaml to prevent CI issues
  2. Update README.md and keep the docs directory flat by adding a backend prefix to each playground example to distinguish between backends

BTW, I think we could allow users to override the backend image with their own registry by adding a DefaultImage method.
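
To make the suggestion concrete, the override could surface to users roughly like the sketch below. The image field shown here is hypothetical; it does not exist in the current API and only illustrates the idea of pointing a backend at a private registry:

```yaml
# Hypothetical sketch: the `image` field below is NOT part of the current API;
# it only illustrates overriding the backend's default image with a private registry.
apiVersion: inference.llmaz.io/v1alpha1
kind: Playground
metadata:
  name: qwen2-0--5b-instruct
spec:
  replicas: 1
  modelClaim:
    modelName: qwen2-0--5b-instruct
  backendRuntimeConfig:
    backendName: sglang
    image: registry.example.com/mirrors/sglang:latest   # hypothetical override of the backend's default image
```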

kerthcet commented 3 months ago

> Update .golangci.yaml to prevent CI issues

Thanks @vicoooo26. When reviewing this part it seemed odd to me, because I had already changed that configuration, but the diff still shows the stale part, see https://github.com/InftyAI/llmaz/blob/f2085013a435d694a31f1c6a88f679f0f3ad1d02/.golangci.yaml#L20-L23

Anyway, this won't block the PR; we can refactor that part later.

> BTW, I think we could allow users to override the backend image with their own registry by adding a DefaultImage method.

If you mean providing an entry point for users to set the image in the Playground, yes, we can. My original thought was that the Playground should be as simple as possible: people who know little about containers but know vLLM, SGLang, or TGI well can easily deploy a model with a Playground. For advanced deployments, they can deploy a Service directly.

At a high level, the Playground is for quick starts and ease of use, while the Service is for extensibility, so you can deploy any model with any inference engine you want. For both deployment approaches, we'll provide scaling, metrics, and fungibility support.

This is the general design of llmaz. We may change it in the future, but that will depend on feedback from users.
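
As a rough illustration of that split (not the actual Service schema; the field names below are my guesses based on a LeaderWorkerSet-style workload template), the advanced path would look something like this, with the user owning the container spec and the image:

```yaml
# Rough sketch of the "advanced" Service path: the user supplies the full container
# spec, including the image. Field names are assumptions and may not match the real CRD.
apiVersion: inference.llmaz.io/v1alpha1
kind: Service
metadata:
  name: qwen2-0--5b-instruct
spec:
  modelClaims:
    models:
      - name: qwen2-0--5b-instruct
  workloadTemplate:
    replicas: 1
    leaderWorkerTemplate:
      workerTemplate:
        spec:
          containers:
            - name: model-runner
              image: lmsysorg/sglang:latest   # user-chosen image, no defaulting by llmaz
              command: ["python3", "-m", "sglang.launch_server"]
              args: ["--model-path", "Qwen/Qwen2-0.5B-Instruct", "--host", "0.0.0.0", "--port", "8080"]
```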

vicoooo26 commented 3 months ago

After checking the GitHub Actions execution history, I found that the previous CI issues happened before PR https://github.com/InftyAI/llmaz/pull/64.
Since both you and I added this rule, it's enough to keep just one.

kerthcet commented 3 months ago

/lgtm
/approve

Thanks @vicoooo26