alibaba / higress

🤖 AI Gateway | AI Native API Gateway
https://higress.io
Apache License 2.0

Question: does Higress support routing by model? #1424

Open ilovedumplings opened 1 month ago

ilovedumplings commented 1 month ago

Why do you need it?

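When a service hosts multiple models, we want to expose them through a single domain using the OpenAI-compatible protocol and route requests for different models to different backend services.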

CH3CHO commented 1 month ago

Which models exactly need to be split, and what is the rule: exact match or something else?

ilovedumplings commented 1 month ago

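For example, we want image models and chat models to share a single domain as the entry point. The matching rule is based on the model name; either prefix or exact matching would work.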

CH3CHO commented 1 month ago

@johnlanni, could you check whether model-router meets this requirement?

johnlanni commented 1 month ago

https://github.com/alibaba/higress/tree/main/plugins/wasm-cpp/extensions/model_router

Does this plugin meet your needs? We plan to productize this capability later. Its usage is similar to OpenRouter:

https://openrouter.ai/docs/model-routing

johnlanni commented 1 month ago

Replying to "Does this rewrite the model?":

Yes. The rewrite strips the provider part and keeps the model part. See the README: https://github.com/alibaba/higress/tree/main/plugins/wasm-cpp/extensions/model_router

ilovedumplings commented 1 month ago

Got it, I did not read carefully enough just now. In actual use, does this mean a route can only be configured with one model-mapping plugin?

johnlanni commented 1 month ago

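Right. The productization I mentioned above builds on this plugin and adds the ability to match routes on the model parameter, i.e. a route can be configured to match the provider contained in the model parameter.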

ilovedumplings commented 1 month ago

From the documentation, my understanding is that the plugin extracts the provider from the model and puts it into a header. If the model contains no provider, does that mean the rest of the flow cannot proceed?

johnlanni commented 1 month ago

Yes, the current design routes by provider.
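
The behavior described above (take the provider out of the model value, put it in a header, forward only the bare model name) can be sketched roughly as follows. This is an illustration, not the plugin's actual code: the `/` separator is an assumption borrowed from the OpenRouter-style `provider/model` naming, the function names are hypothetical, and the header name is the one mentioned later in this thread.

```python
from typing import Optional, Tuple

# Header name taken from the configuration shown later in the thread.
PROVIDER_HEADER = "x-higress-llm-provider"


def split_model(model: str) -> Tuple[Optional[str], str]:
    """Split 'provider/model' into (provider, model).

    Assumption: provider and model are separated by the first '/'.
    Returns (None, model) when no provider prefix is present.
    """
    provider, sep, rest = model.partition("/")
    if not sep or not provider or not rest:
        return None, model
    return provider, rest


def apply_model_router(headers: dict, body: dict) -> Tuple[dict, dict]:
    """Return copies of headers and body as a hypothetical router would forward them."""
    new_headers, new_body = dict(headers), dict(body)
    model = body.get("model")
    if not isinstance(model, str):
        return new_headers, new_body  # nothing to route on
    provider, bare_model = split_model(model)
    if provider is None:
        return new_headers, new_body  # no provider: request passes through unchanged
    new_headers[PROVIDER_HEADER] = provider  # routes can match on this header
    new_body["model"] = bare_model           # provider part is stripped from the body
    return new_headers, new_body
```

Under these assumptions, a model value without a provider prefix leaves both the headers and the body untouched, so no provider-based route can match it, which is the limitation confirmed above.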

ilovedumplings commented 1 month ago

Understood. I had misread the productization plan described above.

ilovedumplings commented 1 month ago

Is there a plugin image available that I could try?

CH3CHO commented 1 month ago

Try oci://higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/model-router:latest

ilovedumplings commented 1 month ago

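@CH3CHO @johnlanni In my testing, model-router only takes effect when it is configured in the global plugin configuration. My use case is routing by model under a single domain. Can the scope be narrowed? My understanding is that the scopes nest as global > domain > route, and the global scope is broader than I need.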

CH3CHO commented 1 month ago

Configuring it at the domain or route level should work. How is it configured now? Does it not take effect at the domain or route level?

ilovedumplings commented 1 month ago

When I add the plugin under the domain configuration it does not take effect; when I add it globally it does.

CH3CHO commented 1 month ago

I reproduced the problem. With the configuration below, requests sent to the home route do not pass through model-router. @johnlanni, please take a look.

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  annotations:
    higress.io/wasm-plugin-title: model-router
  creationTimestamp: "2024-10-25T05:18:46Z"
  generation: 4
  labels:
    higress.io/resource-definer: higress
    higress.io/wasm-plugin-built-in: "false"
    higress.io/wasm-plugin-category: custom
    higress.io/wasm-plugin-name: model-router
    higress.io/wasm-plugin-version: 1.0.0
  name: model-router-1.0.0
  namespace: higress-system
  resourceVersion: "777315"
  uid: 444ab503-a4d1-4c3f-bc70-c5f0c8f3f130
spec:
  defaultConfig:
    enable: true
  defaultConfigDisable: true
  matchRules:
  - config:
      enable: true
    configDisable: false
    ingress:
    - home
  phase: AUTHN
  priority: 1000
  url: oci://higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/model-router:latest

johnlanni commented 1 month ago

@CH3CHO @ilovedumplings Does the request carry a body, and is its content-type JSON? The plugin only processes requests that meet both conditions.
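
The two conditions above can be checked on the client side before sending a test request. The sketch below is only an approximation: the thread does not say exactly how the plugin inspects the content type, so treating `application/json` (ignoring parameters such as `charset`) as the accepted value is an assumption, and the function name is hypothetical.

```python
def is_json_request_with_body(headers: dict, body: bytes) -> bool:
    """True only when the request has a non-empty body and a JSON content type.

    Assumption: 'JSON' means a media type of exactly application/json.
    """
    content_type = ""
    for name, value in headers.items():
        if name.lower() == "content-type":  # header names are case-insensitive
            content_type = value
            break
    # Drop parameters such as '; charset=utf-8' before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return len(body) > 0 and media_type == "application/json"
```

A request that fails this check would be expected to pass through the plugin unmodified, which can look the same as the plugin not being applied to the route at all.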

CH3CHO commented 1 month ago

We have already discussed this offline.

johnlanni commented 1 month ago

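There was indeed a bug. This PR fixes it, and the updated latest image has been pushed. Turn the plugin off and on again to pull the newest version.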

CH3CHO commented 1 month ago

@ilovedumplings Please give it a try.

ilovedumplings commented 1 month ago

Thanks, will do.

ilovedumplings commented 1 month ago

@CH3CHO @johnlanni After enabling model-based routing, I found that my caching, key-auth and similar plugins stopped working. Here is my model-router configuration:

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  annotations:
    higress.io/wasm-plugin-description: 拜托~
    higress.io/wasm-plugin-title: ai-model-route
  creationTimestamp: "2024-10-28T16:17:14Z"
  generation: 2
  labels:
    higress.io/resource-definer: higress
    higress.io/wasm-plugin-built-in: "false"
    higress.io/wasm-plugin-category: custom
    higress.io/wasm-plugin-name: ai-model-route
    higress.io/wasm-plugin-version: 1.0.0
  name: ai-model-route-1.0.0
  namespace: higress-system
  resourceVersion: "1838334"
  selfLink: /apis/extensions.higress.io/v1alpha1/namespaces/higress-system/wasmplugins/ai-model-route-1.0.0
  uid: c8e67b04-c9df-463c-bd5a-ed9177d5542d
spec:
  defaultConfig:
    enable: true
  defaultConfigDisable: false
  matchRules: []
  phase: UNSPECIFIED_PHASE
  priority: 390
  url: oci://higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/model-router:latest

This is my key-auth configuration. It no longer takes effect for the route 164-qwen-model-route.com.

[root@duoyunv1 ~]# kubectl -n higress-system get wasmPlugin key-auth-1.0.0 -o yaml

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  annotations:
    higress.io/wasm-plugin-description: Authentication based on API Key.
    higress.io/wasm-plugin-icon: https://img.alicdn.com/imgextra/i4/O1CN01BPFGlT1pGZ2VDLgaH_!!6000000005333-2-tps-42-42.png
    higress.io/wasm-plugin-title: Key Auth
  creationTimestamp: "2024-10-25T08:44:22Z"
  generation: 48
  labels:
    higress.io/resource-definer: higress
    higress.io/wasm-plugin-built-in: "true"
    higress.io/wasm-plugin-category: auth
    higress.io/wasm-plugin-name: key-auth
    higress.io/wasm-plugin-version: 1.0.0
  name: key-auth-1.0.0
  namespace: higress-system
  resourceVersion: "1838611"
  selfLink: /apis/extensions.higress.io/v1alpha1/namespaces/higress-system/wasmplugins/key-auth-1.0.0
  uid: 7fa1010c-9543-4497-82e7-56acbcc876c5
spec:
  defaultConfig:
    consumers:
    - credential: 2bda943c-ba2b-11ec-ba07-00163e1250b5
      name: consumer1
    - credential: c8c8e9ca-558e-4a2d-bb62-e700dcc40e35
      name: consumer2
    - credential: 8bb8da69-5706-46c1-8357-94227e410b14
      name: 8bb8da69-5706-46c1-8357-94227e410b14
    - credential: 211cd836-791e-4309-abf5-da9cecc22429
      name: shigf-key-auth-1
    - credential: f9476dd9-e3a4-4ce0-92b6-4e077119ab5d
      name: f9476dd9-e3a4-4ce0-92b6-4e077119ab5d
    - credential: f041d84b-e49d-4972-bdc6-1869e3ae03d8
      name: shigf-key-auth-2
    global_auth: false
    in_header: true
    in_query: false
    keys:
    - apikey
  defaultConfigDisable: false
  matchRules:
  - config:
      allow:
      - shigf-key-auth-2
    configDisable: false
    ingress:
    - 164-qwen-model-route.com
  - config:
      allow:
      - shigf-key-auth-1
      - f9476dd9-e3a4-4ce0-92b6-4e077119ab5d
    configDisable: false
    ingress:
    - rout1-164-qwen-route1.com
  phase: AUTHN
  priority: 310
  url: oci://higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/key-auth:latest

Could you both check whether anything is wrong with this configuration?

johnlanni commented 1 month ago

@ilovedumplings Please also paste the YAML of the 164-qwen-model-route.com ingress.

johnlanni commented 1 month ago

Found the cause: model-router runs after key-auth. Change its phase to AUTHN and raise its priority to 900, then check again.

ilovedumplings commented 1 month ago

Sorry for the delay, we had a release yesterday. I just tried changing the phase to AUTHN and the priority to 900, and it works. Is this how the plugin should be configured from now on?

ilovedumplings commented 1 month ago

One more question: if model-router runs after key-auth, shouldn't key-auth take effect first? Why did key-auth not take effect at all?

johnlanni commented 1 month ago

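@ilovedumplings Because model-router modifies the request headers, and only after that is the request re-routed to your target route. The route matched before the header was added does not have key-auth enabled.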
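
The ordering issue can be illustrated with a small sketch that sorts the two plugins the way the phase and priority fields are generally documented for Istio-style WasmPlugin resources: AUTHN before AUTHZ before STATS, UNSPECIFIED_PHASE near the end of the filter chain, and higher priority first within a phase. Whether Higress follows exactly this ordering in every case is an assumption here; the `Plugin` class and `execution_order` function are hypothetical, while the phase and priority values are the ones posted in this thread.

```python
from dataclasses import dataclass

# Assumed phase order: AUTHN runs first, UNSPECIFIED_PHASE is placed
# late in the filter chain, close to the router.
PHASE_ORDER = {"AUTHN": 0, "AUTHZ": 1, "STATS": 2, "UNSPECIFIED_PHASE": 3}


@dataclass
class Plugin:
    name: str
    phase: str
    priority: int


def execution_order(plugins):
    """Sort by phase first, then by descending priority within a phase."""
    ordered = sorted(plugins, key=lambda p: (PHASE_ORDER[p.phase], -p.priority))
    return [p.name for p in ordered]


# Values from the configurations posted above.
original = [
    Plugin("key-auth", "AUTHN", 310),
    Plugin("model-router", "UNSPECIFIED_PHASE", 390),
]
# Values after the suggested change (phase AUTHN, priority 900).
adjusted = [
    Plugin("key-auth", "AUTHN", 310),
    Plugin("model-router", "AUTHN", 900),
]

print(execution_order(original))
print(execution_order(adjusted))
```

With the original values, key-auth is evaluated first, against the route matched before the provider header exists, and that route has no key-auth rule. With the adjusted values, model-router adds the header and triggers re-routing first, so key-auth is evaluated against the target route.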

johnlanni commented 2 weeks ago

@ilovedumplings Note that the model-router plugin is still in beta. Please do not use the image provided earlier in production; further incompatible changes are coming.

johnlanni commented 2 weeks ago

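To stay compatible with the logic described above, where the provider is extracted from the model, the current plugin needs the following configuration: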

addProviderHeader: x-higress-llm-provider