InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Apache License 2.0
31 stars 10 forks

Is there any early proposal or document about integrating with Gateway API ? #165

Open caozhuozi opened 2 months ago

caozhuozi commented 2 months ago

I came across the roadmap and am particularly interested in the Gateway API section. Will llmaz support advanced traffic management features, such as shadow and canary deployments between different model services? If so, could you share how you plan to implement this?

Thanks in advance!

kerthcet commented 2 months ago

shadow and canary deployments between different model services

Thanks for your interest. TL;DR: yes, but there's no concrete design yet.

I think it's a vital feature for production. "Gateway API" here covers a bunch of things, like token-, LoRA-, and model-related routing, and canary deployments can also be part of that (we'll sort these out more clearly later). What llmaz usually does is provide a minimal implementation for out-of-the-box support, but we'll also provide integrations with existing projects, since people usually already run many projects in their clusters and we don't want to increase their maintenance burden. Regarding canary deployments, there are options like Argo Workflows and Istio, so they're all in the plan, I think.
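For illustration, a canary split between two model services could be expressed with the standard Gateway API's weighted `backendRefs` on an `HTTPRoute`. This is only a sketch of what such an integration might look like, not llmaz's design; the `inference-gateway`, `llama-stable`, and `llama-canary` names are hypothetical:

```yaml
# Hypothetical sketch: send 90% of inference traffic to a stable model
# Service and 10% to a canary, using Gateway API weighted backendRefs.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llama-canary-route
spec:
  parentRefs:
    - name: inference-gateway   # hypothetical Gateway resource
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /v1/completions
      backendRefs:
        - name: llama-stable    # hypothetical stable model Service
          port: 8080
          weight: 90
        - name: llama-canary    # hypothetical canary model Service
          port: 8080
          weight: 10
```

Traffic weights are part of the core Gateway API spec, so any conformant implementation (Istio, Envoy Gateway, etc.) should honor them without project-specific extensions.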

About the minimal implementation, I haven't thought too much about that, and we have a bunch of higher priority tasks on hand.

/kind feature

caozhuozi commented 2 months ago

Hi @kerthcet! Thanks so much for your patience and the detailed reply! ❤