alibaba / higress

Cloud Native API Gateway
https://higress.io
Apache License 2.0

feat: support minimax ai model #1033

Closed hanxiantao closed 3 weeks ago

hanxiantao commented 3 weeks ago

Ⅰ. Describe what this PR did

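1) Support the MiniMax AI model

2) Fix the streaming response format when Wenxin Yiyan (ERNIE) is used over the OpenAI protocol (a space was missing after "data:")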

Ⅱ. Does this pull request fix one issue?

fixes https://github.com/alibaba/higress/issues/953

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

docker-compose.yaml

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:1.4.0-rc.1
    entrypoint: /usr/local/bin/envoy
    # Note: debug-level logging is enabled for wasm here; production deployments default to info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
      - httpbin
    networks:
      - wasmtest
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./plugin.wasm:/etc/envoy/plugin.wasm
  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
      - wasmtest
    ports:
      - "12345:80"
networks:
  wasmtest: {}

Using the OpenAI protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: minimax
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                  "provider": {
                                    "type": "minimax",
                                    "apiTokens": [
                                      "YOUR_MINIMAX_API_TOKEN"
                                    ],
                                    "modelMapping": {
                                      "gpt-3": "abab6.5g-chat",
                                      "gpt-4": "abab6.5-chat",
                                      "*": "abab6.5g-chat"
                                    },
                                    "protocol": "openai",
                                    "minimaxGroupId": "YOUR_MINIMAX_GROUP_ID"
                                  }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: minimax
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: minimax
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: api.minimax.chat
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "api.minimax.chat"
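
The modelMapping block in the configuration above maps client-facing model names to MiniMax model names, with "*" as a catch-all. A minimal Python sketch of that lookup follows. The function name resolve_model and the pass-through fallback are assumptions made for illustration, not the plugin's actual code, and only exact matches plus the "*" entry are modeled here.

```python
def resolve_model(requested: str, model_mapping: dict[str, str]) -> str:
    """Return the upstream model name for a client-supplied model name."""
    # An exact entry wins.
    if requested in model_mapping:
        return model_mapping[requested]
    # "*" is the catch-all entry used in the configuration above.
    if "*" in model_mapping:
        return model_mapping["*"]
    # Assumption: with no matching entry, the name is passed through unchanged.
    return requested


# Same mapping as in the envoy.yaml plugin configuration above.
MODEL_MAPPING = {
    "gpt-3": "abab6.5g-chat",
    "gpt-4": "abab6.5-chat",
    "*": "abab6.5g-chat",
}

if __name__ == "__main__":
    for name in ("gpt-3", "gpt-4", "some-other-model"):
        print(name, "->", resolve_model(name, MODEL_MAPPING))
```

This matches what the transcripts in this description show: a request for gpt-3 is answered by abab6.5g-chat, and a request for gpt-4 by abab6.5-chat.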

Non-streaming requests

Example 1: calling the ChatCompletion V2 API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-3",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": false
}'

Response:

{
    "id": "02b459a17dba50e97f7a315e98566796",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "我是一个使用AI技术进行语言交互的软件。我是 MiniMax 的产品,MiniMax是中国的一家科技公司。MiniMax一直致力于大模型的研究,而我则是 MiniMax 研发的最新产品。我能够回答你的各种问题,也可以根据你的要求帮助你完成一些简单的任务。总之,我希望我能帮助你解决你遇到的问题。那么,你有什么需要帮助的吗?",
                "role": "assistant"
            }
        }
    ],
    "created": 1717905059,
    "model": "abab6.5g-chat",
    "object": "chat.completion",
    "usage": {
        "total_tokens": 154
    },
    "input_sensitive": false,
    "output_sensitive": false,
    "input_sensitive_type": 0,
    "output_sensitive_type": 0,
    "base_resp": {
        "status_code": 0,
        "status_msg": ""
    }
}

[Screenshot: non-streaming request using the OpenAI protocol]

Example 2: calling the ChatCompletion Pro API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": false
}'

Response:

{
    "id": "02b45a232f54be51749272e3f3807f52",
    "choices": [
        {
            "index": 0,
            "message": {
                "name": "MM智能助理",
                "role": "assistant",
                "content": "你好!我是MM智能助理,一款由MiniMax公司自主研发的大型语言模型。我可以帮助你解答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"
            },
            "finish_reason": "stop"
        }
    ],
    "created": 1717905189,
    "model": "abab6.5-chat",
    "object": "chat.completion",
    "usage": {
        "total_tokens": 116
    }
}

[Screenshot: non-streaming request using the OpenAI protocol, 2]
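
The gpt-4 request above is sent in OpenAI format, yet the response carries a bot name, which suggests the body is converted to the native ChatCompletion Pro format (bot_setting, messages with sender_type, reply_constraints) shown in the MiniMax protocol section of this description. The Python sketch below shows one plausible conversion. The function name openai_to_pro, the default bot content, the "user" sender name, and the role mapping are all assumptions for illustration; the plugin's real conversion is not shown in this description.

```python
DEFAULT_BOT_NAME = "MM智能助理"  # bot name seen in the sample responses
DEFAULT_BOT_CONTENT = "You are a helpful assistant."  # hypothetical default


def openai_to_pro(body: dict, bot_name: str = DEFAULT_BOT_NAME) -> dict:
    """Convert an OpenAI-style chat body to a ChatCompletion Pro style body."""
    bot_content = DEFAULT_BOT_CONTENT
    messages = []
    for msg in body["messages"]:
        if msg["role"] == "system":
            # Assumption: a system prompt becomes the bot's setting text.
            bot_content = msg["content"]
            continue
        is_bot = msg["role"] == "assistant"
        messages.append({
            "sender_type": "BOT" if is_bot else "USER",
            # "user" is a hypothetical sender name for human turns.
            "sender_name": bot_name if is_bot else "user",
            "text": msg["content"],
        })
    return {
        "model": body["model"],
        "stream": body.get("stream", False),
        "bot_setting": [{"bot_name": bot_name, "content": bot_content}],
        "messages": messages,
        "reply_constraints": {"sender_type": "BOT", "sender_name": bot_name},
    }


if __name__ == "__main__":
    request = {
        "model": "abab6.5-chat",
        "messages": [{"role": "user", "content": "Hello, who are you?"}],
        "stream": False,
    }
    print(openai_to_pro(request))
```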

Streaming requests

Example 1: calling the ChatCompletion V2 API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-3",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}'

Response:

data: {"id":"02b45aa28acd74677c8f2deba51a286d","choices":[{"index":0,"delta":{"content":"你好","role":"assistant"}}],"created":1717905314,"model":"abab6.5g-chat","object":"chat.completion.chunk","output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0}

data: {"id":"02b45aa28acd74677c8f2deba51a286d","choices":[{"finish_reason":"stop","index":0,"delta":{"content":",我是MM智能助理。我是一个由MiniMax自研的大型语言模型。我拥有超过1,300亿个参数,可以回答各种问题。我可以帮助你解决各种问题。请问有什么需要帮助的吗?","role":"assistant"}}],"created":1717905315,"model":"abab6.5g-chat","object":"chat.completion.chunk","output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0}

data: {"id":"02b45aa28acd74677c8f2deba51a286d","choices":[{"finish_reason":"stop","index":0,"message":{"content":"你好,我是MM智能助理。我是一个由MiniMax自研的大型语言模型。我拥有超过1,300亿个参数,可以回答各种问题。我可以帮助你解决各种问题。请问有什么需要帮助的吗?","role":"assistant"}}],"created":1717905315,"model":"abab6.5g-chat","object":"chat.completion","usage":{"total_tokens":120},"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"base_resp":{"status_code":0,"status_msg":""}}

[Screenshot: streaming request using the OpenAI protocol]
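
In the V2 stream above, incremental text arrives under delta, and the last event repeats the whole reply under message. A client that concatenates text should therefore read only the delta chunks. The Python sketch below illustrates this. The function name collect_stream_text is hypothetical, and the "[DONE]" check is a defensive assumption borrowed from OpenAI-style streams; no such marker appears in the transcript above.

```python
import json


def collect_stream_text(sse_body: str) -> str:
    """Concatenate delta.content from a chat-completion event stream."""
    parts = []
    for line in sse_body.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other fields
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":  # assumption: optional end marker
            continue
        event = json.loads(payload)
        for choice in event.get("choices", []):
            delta = choice.get("delta")
            # The final summary event uses "message", not "delta", so it is
            # skipped here and the text is not counted twice.
            if delta and delta.get("content"):
                parts.append(delta["content"])
    return "".join(parts)


if __name__ == "__main__":
    sample = (
        'data: {"choices":[{"index":0,"delta":{"content":"Hel","role":"assistant"}}]}\n\n'
        'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}\n\n'
        'data: {"choices":[{"index":0,"message":{"content":"Hello"}}]}\n\n'
    )
    print(collect_stream_text(sample))
```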

Example 2: calling the ChatCompletion Pro API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}'

Response:

data: {"choices":[{"index":0,"message":{"name":"MM智能助理","role":"assistant","content":"你好"}}],"created":1717905388,"model":"abab6.5-chat","object":"chat.completion","usage":{}}

data: {"choices":[{"index":0,"message":{"name":"MM智能助理","role":"assistant","content":"!我是MM智能助理,一款由MiniMax公司自主研发的大型语言模型。我可以帮助你解答问题、提供信息、进行对话和执行"}}],"created":1717905389,"model":"abab6.5-chat","object":"chat.completion","usage":{}}

data: {"choices":[{"index":0,"message":{"name":"MM智能助理","role":"assistant","content":"多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"}}],"created":1717905390,"model":"abab6.5-chat","object":"chat.completion","usage":{}}

data: {"id":"02b45aebe6101dd3cc5028a5b1dae0f3","choices":[{"index":0,"message":{"name":"MM智能助理","role":"assistant","content":"你好!我是MM智能助理,一款由MiniMax公司自主研发的大型语言模型。我可以帮助你解答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"},"finish_reason":"stop"}],"created":1717905390,"model":"abab6.5-chat","object":"chat.completion","usage":{"total_tokens":116}}

[Screenshot: streaming request using the OpenAI protocol, 2]

Using the MiniMax protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: minimax
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                  "provider": {
                                    "type": "minimax",
                                    "apiTokens": [
                                      "YOUR_MINIMAX_API_TOKEN"
                                    ],
                                    "protocol": "original"
                                  }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: minimax
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: minimax
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: api.minimax.chat
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "api.minimax.chat"

Non-streaming requests

Example 1: calling the ChatCompletion V2 API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "abab6.5g-chat",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": false
}'

Response:

{
    "id": "02b45d3ae75a857392ea089f4d376827",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "MM智能助理:我叫MM智能助理,我是由MiniMax公司研发的智能助理,可以为用户提供多种智能服务。我可以通过自然语言理解(NLU)和自然语言生成(NLG)来理解用户的问题并提供相应的解决方案,还可以根据用户的要求进行个性化的定制。",
                "role": "assistant"
            }
        }
    ],
    "created": 1717905980,
    "model": "abab6.5g-chat",
    "object": "chat.completion",
    "usage": {
        "total_tokens": 132
    },
    "input_sensitive": false,
    "output_sensitive": false,
    "input_sensitive_type": 0,
    "output_sensitive_type": 0,
    "base_resp": {
        "status_code": 0,
        "status_msg": ""
    }
}

[Screenshot: non-streaming request using the MiniMax protocol, 2]

Example 2: calling the ChatCompletion Pro API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "bot_setting": [
        {
            "bot_name": "MM智能助理",
            "content": "MM智能助理是一款由MiniMax自研的,没有调用其他产品的接口的大型语言模型。MiniMax是一家中国科技公司,一直致力于进行大模型相关的研究。"
        }
    ],
    "messages": [
        {
            "sender_type": "USER",
            "sender_name": "小明",
            "text": "你好,你是谁?"
        }
    ],
    "reply_constraints": {
        "sender_type": "BOT",
        "sender_name": "MM智能助理"
    },
    "model": "abab6.5s-chat",
    "tokens_to_generate": 2048,
    "temperature": 0.01,
    "top_p": 0.95,
    "stream": false
}'

Response:

{
    "created": 1717905759,
    "model": "abab6.5s-chat",
    "reply": "你好!我是MM智能助理,一款由MiniMax公司自研的大型语言模型。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!",
    "choices": [
        {
            "finish_reason": "stop",
            "messages": [
                {
                    "sender_type": "BOT",
                    "sender_name": "MM智能助理",
                    "text": "你好!我是MM智能助理,一款由MiniMax公司自研的大型语言模型。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"
                }
            ]
        }
    ],
    "usage": {
        "total_tokens": 116
    },
    "input_sensitive": false,
    "output_sensitive": false,
    "id": "02b45c5d4d3be74655c8d5335188e568",
    "base_resp": {
        "status_code": 0,
        "status_msg": ""
    }
}

[Screenshot: non-streaming request using the MiniMax protocol]

Streaming requests

Example 1: calling the ChatCompletion V2 API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "abab6.5g-chat",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}'

Response:

data: {"id":"02b45d9806cb2d7760a906a8590f4a4f","choices":[{"index":0,"delta":{"content":"你好","role":"assistant"}}],"created":1717906073,"model":"abab6.5g-chat","object":"chat.completion.chunk","output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0}

data: {"id":"02b45d9806cb2d7760a906a8590f4a4f","choices":[{"index":0,"delta":{"content":",我的名字是MM智能助理,是一款由中国科技公司MiniMax研发的大模型产品。我能够处理自然语言信息,回答各种问题,同时我也可以和您聊天。您有任何问题都可以","role":"assistant"}}],"created":1717906074,"model":"abab6.5g-chat","object":"chat.completion.chunk","output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0}

data: {"id":"02b45d9806cb2d7760a906a8590f4a4f","choices":[{"finish_reason":"stop","index":0,"delta":{"content":"向我咨询,我会尽我的能力帮助您。","role":"assistant"}}],"created":1717906074,"model":"abab6.5g-chat","object":"chat.completion.chunk","output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0}

data: {"id":"02b45d9806cb2d7760a906a8590f4a4f","choices":[{"finish_reason":"stop","index":0,"message":{"content":"你好,我的名字是MM智能助理,是一款由中国科技公司MiniMax研发的大模型产品。我能够处理自然语言信息,回答各种问题,同时我也可以和您聊天。您有任何问题都可以向我咨询,我会尽我的能力帮助您。","role":"assistant"}}],"created":1717906074,"model":"abab6.5g-chat","object":"chat.completion","usage":{"total_tokens":123},"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"base_resp":{"status_code":0,"status_msg":""}}

[Screenshot: streaming request using the MiniMax protocol, 2]

Example 2: calling the ChatCompletion Pro API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "bot_setting": [
        {
            "bot_name": "MM智能助理",
            "content": "MM智能助理是一款由MiniMax自研的,没有调用其他产品的接口的大型语言模型。MiniMax是一家中国科技公司,一直致力于进行大模型相关的研究。"
        }
    ],
    "messages": [
        {
            "sender_type": "USER",
            "sender_name": "小明",
            "text": "你好,你是谁?"
        }
    ],
    "reply_constraints": {
        "sender_type": "BOT",
        "sender_name": "MM智能助理"
    },
    "model": "abab6.5s-chat",
    "tokens_to_generate": 2048,
    "temperature": 0.01,
    "top_p": 0.95,
    "stream": true
}'

Response:

data: {"created":1717905811,"model":"abab6.5s-chat","reply":"","choices":[{"messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"你好"}]}],"output_sensitive":false,"request_id":"YOUR_MINIMAX_GROUP_ID_1717905810674735"}

data: {"created":1717905812,"model":"abab6.5s-chat","reply":"","choices":[{"messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"!我是MM智能助理,一款由MiniMax公司自研的大型语言模型。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"}]}],"output_sensitive":false,"request_id":"YOUR_MINIMAX_GROUP_ID_1717905810674735"}

data: {"created":1717905812,"model":"abab6.5s-chat","reply":"你好!我是MM智能助理,一款由MiniMax公司自研的大型语言模型。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!","choices":[{"finish_reason":"stop","messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"你好!我是MM智能助理,一款由MiniMax公司自研的大型语言模型。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"}]}],"usage":{"total_tokens":116},"input_sensitive":false,"output_sensitive":false,"id":"02b45c92e213b1cfa4b7e0e1ab4cc3b5","base_resp":{"status_code":0,"status_msg":""}}

[Screenshot: streaming request using the MiniMax protocol]

Fixing the streaming response format for Wenxin Yiyan over the OpenAI protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: baidu
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                  "provider": {
                                    "type": "baidu",
                                    "apiTokens": [
                                      "YOUR_BAIDU_API_TOKEN"
                                    ],
                                    "modelMapping": {
                                      "gpt-3": "ERNIE-4.0-8K",
                                      "*": "ERNIE-4.0-8K"
                                    },
                                    "protocol": "openai"
                                  }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: baidu
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: baidu
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: aip.baidubce.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "aip.baidubce.com"

Streaming request

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-3",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}'

Response:

data: {"id":"as-e8yq69nwwe","choices":[{"index":0,"message":{"role":"assistant","content":"你好,"}}],"created":1717765832,"model":"ERNIE-4.0-8K","object":"chat.completion","usage":{"prompt_tokens":4,"total_tokens":4}}

data: {"id":"as-e8yq69nwwe","choices":[{"index":0,"message":{"role":"assistant","content":"我是文心一言,可以协助你完成范围广泛的任务并提供有关各种主题的信息,比如回答问题,提供定义和解释及建议。"}}],"created":1717765834,"model":"ERNIE-4.0-8K","object":"chat.completion","usage":{"prompt_tokens":4,"total_tokens":4}}

data: {"id":"as-e8yq69nwwe","choices":[{"index":0,"message":{"role":"assistant","content":"如果你有任何问题,请随时向我提问。"}}],"created":1717765835,"model":"ERNIE-4.0-8K","object":"chat.completion","usage":{"prompt_tokens":4,"total_tokens":4}}

[Screenshot: fixed streaming response format for Wenxin Yiyan over the OpenAI protocol]
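
The fix concerns the framing of each streamed event: the response above now starts every event with "data: " (colon, then a space) and separates events with a blank line. The server-sent events format treats the space after the colon as optional, but clients that match the literal "data: " prefix will fail without it. The Python sketch below shows the intended framing. The helper name format_sse_event is hypothetical and this is not the plugin's code.

```python
import json


def format_sse_event(chunk: dict) -> str:
    """Serialize one chunk as a server-sent event with a 'data: ' prefix."""
    payload = json.dumps(chunk, ensure_ascii=False, separators=(",", ":"))
    # Note the space after the colon and the blank line that ends the event.
    return f"data: {payload}\n\n"


if __name__ == "__main__":
    event = format_sse_event({
        "id": "as-e8yq69nwwe",
        "model": "ERNIE-4.0-8K",
        "object": "chat.completion",
    })
    print(repr(event))
```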

Ⅴ. Special notes for reviews

hanxiantao commented 3 weeks ago

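"I see that MiniMax has three Chat Completion APIs. The current implementation uses V2. Is there a reason for that?"

Only ChatCompletion v2 supports all models. ChatCompletion Pro is only for the abab6.5, abab6.5s, and abab5.5s models (recommended as the preferred choice), and ChatCompletion is only for the abab5.5 and abab5.5s models (recommended as the preferred choice). I am considering adding a MiniMax-specific field that lets users choose which API to call, which seems more appropriate.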

CH3CHO commented 3 weeks ago

Or how about selecting the API automatically based on the model?

hanxiantao commented 3 weeks ago

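That works too. For the abab6.5, abab6.5s, and abab5.5s models, ChatCompletion Pro would be preferred; abab5.5 would prefer ChatCompletion; all other models would use ChatCompletion v2. I will adjust the implementation along these lines.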

hanxiantao commented 3 weeks ago

ChatCompletion Pro also supports abab5.5. Current implementation logic: for the abab6.5, abab6.5s, abab5.5s, and abab5.5 models, ChatCompletion Pro is preferred; other models (abab6.5t, abab6.5g) use ChatCompletion v2.
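
The selection rule described in this comment can be sketched in Python as follows. The function name select_minimax_api is hypothetical, and matching on the exact "-chat" model names seen in the requests above is an assumption; the actual plugin may match models differently.

```python
# Models routed to ChatCompletion Pro, per the comment above. The "-chat"
# suffix follows the model names used in the sample requests (assumption).
PRO_MODELS = frozenset({
    "abab6.5-chat",
    "abab6.5s-chat",
    "abab5.5s-chat",
    "abab5.5-chat",
})


def select_minimax_api(model: str) -> str:
    """Pick the MiniMax API for a model: Pro if listed, otherwise v2."""
    if model in PRO_MODELS:
        return "ChatCompletion Pro"
    return "ChatCompletion v2"


if __name__ == "__main__":
    for name in ("abab6.5-chat", "abab6.5s-chat", "abab6.5g-chat", "abab6.5t-chat"):
        print(name, "->", select_minimax_api(name))
```

This agrees with the transcripts above, where gpt-4 (mapped to abab6.5-chat) returns a Pro-style reply and gpt-3 (mapped to abab6.5g-chat) returns a V2-style reply.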