fixed ai-statistics plugin statistics error

pepesi commented 5 days ago

CLAassistant commented 5 days ago

All committers have signed the CLA.

johnlanni commented 4 days ago

cc @rinfx

johnlanni commented 3 days ago

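@pepesi Could it be that Baichuan and Zhipu AI don't return usage? Have you checked whether their APIs support it? If they do, it would be best to extend the related logic in ai-proxy.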

pepesi commented 3 days ago

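> @pepesi Could it be that Baichuan and Zhipu AI don't return usage? Have you checked whether their APIs support it? If they do, it would be best to extend the related logic in ai-proxy.

They do return usage. The problem is that the final message in the stream is [DONE], which is not a chunk object. So far I have only tested baichuan and zhipuai; the other providers have not been tested yet.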

pepesi commented 3 days ago

baichuan

data: {"id":"chatcmpl-M6404016RK8MoIC","object":"chat.completion.chunk","created":1719490102,"model":"Baichuan4","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"}}]}

data:

data: {"id":"chatcmpl-M6404016RK8MoIC","object":"chat.completion.chunk","created":1719490102,"model":"Baichuan4","choices":[{"index":0,"delta":{"role":"assistant","content":"! How can I"}}]}

data:

data: {"id":"chatcmpl-M6404016RK8MoIC","object":"chat.completion.chunk","created":1719490103,"model":"Baichuan4","choices":[{"index":0,"delta":{"role":"assistant","content":" assist you today?"}}]}

data:

data: {"id":"chatcmpl-M6404016RK8MoIC","object":"chat.completion.chunk","created":1719490103,"model":"Baichuan4","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":"stop"}],"usage":{"prompt_tokens":3,"completion_tokens":10,"total_tokens":13}}

data:

data: [DONE]

data:

zhipuai

event:add
id:lang-to-lang-v4-1719490222731-141670
data:Hello! How can I assist you today? If you have any questions or need advice on a topic, feel free to ask.

event:finish
id:lang-to-lang-v4-1719490222731-141670
data:{"choices":[{"index":0,"finish_reason":"stop","delta":{"role":"assistant","content":""}}],"usage":{"prompt_tokens":29,"completion_tokens":28,"total_tokens":57},"request_id":null,"task_status":null,"created":1719490224,"model":"glm-4-0520","id":"8786554896418608324","error":null}
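To illustrate the point, here is a minimal Python sketch (not the plugin's actual Go code; `extract_usage` is a hypothetical helper). It assumes the stream is available as text, skips empty `data:` lines and the `[DONE]` sentinel, and keeps the last `usage` object it sees, so both stream shapes shown here yield token counts.

```python
import json
from typing import Optional


def extract_usage(sse_text: str) -> Optional[dict]:
    """Return the last `usage` object found in an SSE stream, or None."""
    usage = None
    for line in sse_text.splitlines():
        if not line.startswith("data:"):
            continue  # ignore `event:` / `id:` fields and blank lines
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue  # empty data line or end-of-stream sentinel, not a chunk
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # plain-text data, e.g. the zhipuai `event:add` content
        if isinstance(chunk, dict) and isinstance(chunk.get("usage"), dict):
            usage = chunk["usage"]
    return usage


# Shortened version of the baichuan stream: the usage sits in the chunk
# *before* the final [DONE] message.
stream = (
    'data: {"choices":[{"index":0,"delta":{"content":""},"finish_reason":"stop"}],'
    '"usage":{"prompt_tokens":3,"completion_tokens":10,"total_tokens":13}}\n'
    "\n"
    "data:\n"
    "\n"
    "data: [DONE]\n"
)
print(extract_usage(stream))
# prints {'prompt_tokens': 3, 'completion_tokens': 10, 'total_tokens': 13}
```

Reading only the very last message would see `[DONE]` and find no usage, which matches the miscount described in this thread.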

pepesi commented 3 days ago

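In my latest testing, I found that the ai-token-ratelimit plugin appears to depend on the data that ai-statistics injects into filter_state. We need to confirm whether this is the intended design.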

johnlanni commented 2 days ago

cc @cr7258

cr7258 commented 1 day ago

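@johnlanni The reason ai-statistics fails to count correctly matches what @pepesi described: the last message is [DONE] rather than a chunk object. ai-token-ratelimit relies on the input_token and output_token injected by ai-statistics to enforce rate limiting. I have reviewed and verified the code in this PR: it fixes the token counting problem in ai-statistics, and it also simplifies the duplicated code that set input_token and output_token for ai-token-ratelimit.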
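The dependency between the two plugins can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the plugins' real code: the filter state is modelled as a plain dict, `record_usage` and `tokens_to_charge` are hypothetical names, and charging the sum of both counts is an assumption.

```python
from typing import Dict, Optional

FilterState = Dict[str, int]  # stand-in for the per-request filter state


def record_usage(filter_state: FilterState, usage: dict) -> None:
    """What the statistics side does: publish the token counts."""
    filter_state["input_token"] = int(usage.get("prompt_tokens", 0))
    filter_state["output_token"] = int(usage.get("completion_tokens", 0))


def tokens_to_charge(filter_state: FilterState) -> Optional[int]:
    """What the rate-limit side does: read the counts, if they were published."""
    if "input_token" not in filter_state or "output_token" not in filter_state:
        return None  # nothing was recorded, so nothing can be charged
    # Assumption: the limiter charges input and output tokens together.
    return filter_state["input_token"] + filter_state["output_token"]


state: FilterState = {}
print(tokens_to_charge(state))  # prints None: usage was never recorded
record_usage(state, {"prompt_tokens": 3, "completion_tokens": 10})
print(tokens_to_charge(state))  # prints 13
```

When the statistics side misses the usage, the rate-limit side has no counts to work with, so the counting bug also affects rate limiting.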

The ai-statistics plugin works correctly:

istio-proxy@higress-gateway-659965d767-tnpwv:/$ curl -s http://localhost:15090/stats/prometheus |grep token | grep -E "baichuan|qwen"
# TYPE route_baichuan_upstream_outbound_443__baichuan_dns_model_Baichuan4_input_token counter
route_baichuan_upstream_outbound_443__baichuan_dns_model_Baichuan4_input_token{} 24
# TYPE route_baichuan_upstream_outbound_443__baichuan_dns_model_Baichuan4_output_token counter
route_baichuan_upstream_outbound_443__baichuan_dns_model_Baichuan4_output_token{} 110
# TYPE route_qwen_upstream_outbound_443__qwen_dns_model_qwen_turbo_input_token counter
route_qwen_upstream_outbound_443__qwen_dns_model_qwen_turbo_input_token{} 13
# TYPE route_qwen_upstream_outbound_443__qwen_dns_model_qwen_turbo_output_token counter
route_qwen_upstream_outbound_443__qwen_dns_model_qwen_turbo_output_token{} 33
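For checking these counters in a script rather than by eye, a minimal Python sketch could look like the following. `parse_token_counters` is a hypothetical helper, and it assumes the metric names end in `_input_token` or `_output_token` as in the output above.

```python
import re
from typing import Dict

# One sample line: metric name, optional {labels}, then the value.
_SAMPLE = re.compile(
    r"^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(?P<value>\S+)$"
)


def parse_token_counters(text: str) -> Dict[str, float]:
    """Collect *_input_token / *_output_token samples from Prometheus text."""
    counters: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # TYPE / # HELP comments
        match = _SAMPLE.match(line)
        if match and match.group("name").endswith(("_input_token", "_output_token")):
            counters[match.group("name")] = float(match.group("value"))
    return counters


sample = """\
# TYPE route_baichuan_upstream_outbound_443__baichuan_dns_model_Baichuan4_input_token counter
route_baichuan_upstream_outbound_443__baichuan_dns_model_Baichuan4_input_token{} 24
# TYPE route_baichuan_upstream_outbound_443__baichuan_dns_model_Baichuan4_output_token counter
route_baichuan_upstream_outbound_443__baichuan_dns_model_Baichuan4_output_token{} 110
"""
counters = parse_token_counters(sample)
print(sum(counters.values()))  # prints 134.0
```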

The ai-token-ratelimit plugin works correctly:

kubectl port-forward -n higress-system svc/higress-gateway 18000:80
# Map the test hostname in the local /etc/hosts
curl "http://baichuan-test.com:18000/v1/chat/completions?apikey=777777"  -H "Content-Type: application/json"  -d '{
  "model": "Baichuan4",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "你好，你是谁？"
    }
  ],
  "stream": true
}' -i
HTTP/1.1 429 Too Many Requests
x-ratelimit-reset: 35
content-length: 17
content-type: text/plain
date: Sat, 29 Jun 2024 14:00:25 GMT
server: istio-envoy

Too many requests
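A client can use the 429 response to decide how long to back off. The Python sketch below is a hedged illustration: `retry_delay_seconds` is a hypothetical helper, it assumes `x-ratelimit-reset` holds the number of seconds until the limit window resets, and the 1-second fallback is an arbitrary choice.

```python
from typing import Mapping, Optional


def retry_delay_seconds(status: int, headers: Mapping[str, str]) -> Optional[float]:
    """Return how long to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    # Header names are case-insensitive, so normalise them first.
    lowered = {key.lower(): value for key, value in headers.items()}
    raw = lowered.get("x-ratelimit-reset")
    if raw is None:
        return 1.0  # arbitrary fallback when the header is missing
    try:
        # Assumption: the value is a number of seconds, as in the response above.
        return max(0.0, float(raw))
    except ValueError:
        return 1.0  # arbitrary fallback when the header is not numeric


print(retry_delay_seconds(429, {"x-ratelimit-reset": "35"}))  # prints 35.0
```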

Full configuration:

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: ai-proxy
  namespace: higress-system
spec:
  matchRules:
  - config:
      provider:
        type: qwen
        apiTokens:
        - "<api-token>"
        modelMapping:
          'gpt-3': "qwen-turbo"
          'gpt-35-turbo': "qwen-plus"
          'gpt-4-turbo': "qwen-max"
          '*': "qwen-turbo"
    ingress:
    - qwen
  - config:
      provider:
        type: baichuan
        apiTokens:
        - "<api-token>"
    ingress:
    - baichuan
  url: oci://ghcr.io/cr7258/wasm-go-ai-proxy:v1.0.46
  phase: UNSPECIFIED_PHASE
  priority: 100
---
apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: ai-statistics
  namespace: higress-system
spec:
  defaultConfig:
    enable: true
  url: oci://ghcr.io/cr7258/wasm-go-ai-token-statistics:v1.0.47
  phase: UNSPECIFIED_PHASE
  priority: 200
---
apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: ai-token-ratelimit
  namespace: higress-system
spec:
  defaultConfig:
    rule_name: default_limit_by_param_apikey
    rule_items:
    - limit_by_param: apikey
      limit_keys:
      - key: 777777
        token_per_minute: 5
    redis:
      service_name: redis.static
      service_port: 6379
  url: oci://ghcr.io/cr7258/wasm-go-ai-token-ratelimit:v1.0.47
  phase: UNSPECIFIED_PHASE
  priority: 600
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    higress.io/backend-protocol: HTTPS
    higress.io/destination: qwen.dns
    higress.io/proxy-ssl-name: dashscope.aliyuncs.com
    higress.io/proxy-ssl-server-name: "on"
  labels:
    higress.io/resource-definer: higress
  name: qwen
  namespace: higress-system
spec:
  ingressClassName: higress
  rules:
  - host: qwen-test.com
    http:
      paths:
      - backend:
          resource:
            apiGroup: networking.higress.io
            kind: McpBridge
            name: default
        path: /
        pathType: Prefix
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    higress.io/backend-protocol: HTTPS
    higress.io/destination: baichuan.dns
    higress.io/proxy-ssl-name: api.baichuan-ai.com
    higress.io/proxy-ssl-server-name: "on"
  labels:
    higress.io/resource-definer: higress
  name: baichuan
  namespace: higress-system
spec:
  ingressClassName: higress
  rules:
  - host: baichuan-test.com
    http:
      paths:
      - backend:
          resource:
            apiGroup: networking.higress.io
            kind: McpBridge
            name: default
        path: /
        pathType: Prefix
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    higress.io/destination: redis.static
    higress.io/ignore-path-case: "false"
  labels:
    higress.io/resource-definer: higress
  name: redis.static
spec:
  ingressClassName: higress
  rules:
  - http:
      paths:
      - backend:
          resource:
            apiGroup: networking.higress.io
            kind: McpBridge
            name: default
        path: /
        pathType: Prefix
---
apiVersion: networking.higress.io/v1
kind: McpBridge
metadata:
  name: default
  namespace: higress-system
spec:
  registries:
  - domain: dashscope.aliyuncs.com
    name: qwen
    port: 443
    type: dns
  - domain: api.baichuan-ai.com
    name: baichuan
    port: 443
    type: dns
  - domain: 192.168.2.150:6379  # Redis service started locally
    name: redis
    type: static
    port: 6379

alibaba / higress

fixed ai-statistics plugin statistics error #1060