kubernetes-sigs / llm-instance-gateway

LLM Instance gateway implementation.
Apache License 2.0

Gateways supporting LLMServerPool as an HTTPRoute BackendRef #19

Open kfswain opened 1 day ago

kfswain commented 1 day ago

Our current Envoy integration relies on EnvoyExtensionPolicy and EnvoyPatchPolicy. This is very manual and not sustainable. (See: https://github.com/kubernetes-sigs/llm-instance-gateway/pull/18)

We're trying to settle on a single gateway implementation that this project will extend to support LLMServerPool as a Gateway API backend. This will let us run e2e tests against these concepts and iterate more quickly. That implementation should be:

  1. An existing conformant implementation of Gateway API
  2. Part of CNCF
  3. Envoy-based for simplicity of extension mechanisms
  4. Open to contributions from us to support this new type of backend

We propose extending this existing gateway implementation to act as the controller for the LLMServerPool object (see: https://github.com/kubernetes-sigs/llm-instance-gateway/blob/main/docs/proposals/002-api-proposal/proposal.md#llmserverpool), as well as updating HTTPRoute to support an LLMServerPool as a backendRef.
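
To make the backendRef idea concrete, here is a minimal Go sketch of how a controller might tell an LLMServerPool backendRef apart from a regular Service one. Everything in it is an assumption for illustration: `BackendRef` is a simplified local stand-in for the Gateway API backend reference fields (not the real `sigs.k8s.io/gateway-api` type), the API group `inference.example.io` is a placeholder since this issue does not name one, and `ResolveBackend` is a hypothetical helper rather than anything from an existing gateway.

```go
package main

import "fmt"

// BackendRef is a simplified, local stand-in for the group/kind/name/port
// fields of a Gateway API backend reference. It is not the upstream type.
type BackendRef struct {
	Group string
	Kind  string
	Name  string
	Port  int
}

// Assumed identifiers: the group is a placeholder, not taken from the proposal.
const (
	llmServerPoolGroup = "inference.example.io"
	llmServerPoolKind  = "LLMServerPool"
)

// IsLLMServerPoolRef reports whether ref targets an LLMServerPool.
func IsLLMServerPoolRef(ref BackendRef) bool {
	return ref.Group == llmServerPoolGroup && ref.Kind == llmServerPoolKind
}

// ResolveBackend returns a label describing which kind of backend the
// reference points at, or an error for kinds this sketch does not handle.
// An empty group with an empty or "Service" kind is treated as a core Service,
// matching the Gateway API defaults.
func ResolveBackend(ref BackendRef) (string, error) {
	switch {
	case IsLLMServerPoolRef(ref):
		return "llmserverpool/" + ref.Name, nil
	case ref.Group == "" && (ref.Kind == "" || ref.Kind == "Service"):
		return fmt.Sprintf("service/%s:%d", ref.Name, ref.Port), nil
	default:
		return "", fmt.Errorf("unsupported backendRef kind %q in group %q", ref.Kind, ref.Group)
	}
}

func main() {
	refs := []BackendRef{
		{Group: llmServerPoolGroup, Kind: llmServerPoolKind, Name: "pool-a"},
		{Name: "web", Port: 8080},
		{Group: "example.com", Kind: "Other", Name: "x"},
	}
	for _, ref := range refs {
		target, err := ResolveBackend(ref)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("resolved:", target)
	}
}
```

In a real implementation the LLMServerPool branch would hand off to whatever endpoint-selection logic the gateway provides; the sketch only shows the dispatch on group and kind.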

At a high level we expect this to look like:

arkodg commented 1 day ago

Hey @kfswain, I've created https://github.com/envoyproxy/gateway/issues/4423 to make a decision on supporting this in Envoy Gateway.