songquanpeng / one-api

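OpenAI API management & redistribution system supporting Azure, Anthropic Claude, Google PaLM 2 & Gemini, Zhipu ChatGLM, Baidu ERNIE Bot (Wenxin Yiyan), iFlytek Spark, Alibaba Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. It can be used to manage and redistribute keys, ships as a single executable with a prebuilt Docker image, and deploys in one step, ready to use out of the box. It exposes a single API for all LLMs and features an English UI.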
https://openai.justsong.cn/
MIT License
16.68k stars, 3.84k forks

MiniMax channel: abab6.5 models are not supported; when the `max_tokens` parameter is missing, both streaming and non-streaming answers are truncated. #1399

Open MurphyLo opened 2 months ago

MurphyLo commented 2 months ago

Routine checks

Problem description

As the title says, using version v0.6.6-alpha.14, I ran the following commands:

  1. Streaming, max_tokens not specified

    curl https://one-api-host/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer sk-xxx" \
    -d '{
    "model": "abab6.5-chat",
    "messages": [{"role": "user", "content": "用两句话夸我帅"}],
    "temperature": 0.2,
    "stream": true
    }'
  2. Streaming, max_tokens=4000

    curl https://one-api-host/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer sk-xxx" \
    -d '{
    "model": "abab6.5-chat",
    "messages": [{"role": "user", "content": "用两句话夸我帅"}],
    "temperature": 0.2,
    "max_tokens": 4000,
    "stream": true
    }'
  3. Non-streaming, max_tokens not specified

    curl https://one-api-host/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer sk-xxx" \
    -d '{
    "model": "abab6.5-chat",
    "messages": [{"role": "user", "content": "用两句话夸我帅"}],
    "temperature": 0.2,
    "stream": false
    }'
  4. Non-streaming, max_tokens=4000

    curl https://one-api-host/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer sk-xxx" \
    -d '{
    "model": "abab6.5-chat",
    "messages": [{"role": "user", "content": "用两句话夸我帅"}],
    "temperature": 0.2,
    "max_tokens": 4000,
    "stream": false
    }'

The responses, in the same order, were:

  1. Streaming, max_tokens not specified

    data: {"id":"xxx","choices":[{"index":0,"delta":{"content":"你","role":"assistant"}}],"created":1714458858,"model":"abab6.5-chat","object":"chat.completion.chunk"}

    data: {"id":"xxx","choices":[{"finish_reason":"length","index":0,"delta":{"content":"不仅外表英俊","role":"assistant"}}],"created":1714458858,"model":"abab6.5-chat","object":"chat.completion.chunk"}

    data: {"id":"xxx","choices":[{"finish_reason":"length","index":0,"message":{"content":"你不仅外表英俊","role":"assistant"}}],"created":1714458858,"model":"abab6.5-chat","object":"chat.completion","usage":{"total_tokens":76},"base_resp":{"status_code":0,"status_msg":""}}

  2. Streaming, max_tokens=4000

    data: {"id":"xxx","choices":[{"index":0,"delta":{"content":"你","role":"assistant"}}],"created":1714458504,"model":"abab6.5-chat","object":"chat.completion.chunk"}

    data: {"id":"xxx","choices":[{"index":0,"delta":{"content":"拥有着令人印象深刻的英俊外表,每一个微笑都像是阳光下的璀璨星辰,让人无法移开视线。你的帅气不仅仅是外表,更是一种由内而外散发","role":"assistant"}}],"created":1714458505,"model":"abab6.5-chat","object":"chat.completion.chunk"}

    data: {"id":"xxx","choices":[{"finish_reason":"stop","index":0,"delta":{"content":"的自信和魅力,让人不禁赞叹。","role":"assistant"}}],"created":1714458505,"model":"abab6.5-chat","object":"chat.completion.chunk"}

    data: {"id":"xxx","choices":[{"finish_reason":"stop","index":0,"message":{"content":"你拥有着令人印象深刻的英俊外表,每一个微笑都像是阳光下的璀璨星辰,让人无法移开视线。你的帅气不仅仅是外表,更是一种由内而外散发的自信和魅力,让人不禁赞叹。","role":"assistant"}}],"created":1714458505,"model":"abab6.5-chat","object":"chat.completion","usage":{"total_tokens":117},"base_resp":{"status_code":0,"status_msg":""}}

  3. Non-streaming, max_tokens not specified

    {"id":"xxx","choices":[{"finish_reason":"length","index":0,"message":{"content":"你不仅外表英俊","role":"assistant"}}],"created":1714458902,"model":"abab6.5-chat","object":"chat.completion","usage":{"total_tokens":76},"base_resp":{"status_code":0,"status_msg":""}}

  4. Non-streaming, max_tokens=4000

    {"id":"xxx","choices":[{"finish_reason":"stop","index":0,"message":{"content":"你拥有令人瞩目的英俊外表,每一次出现都像是从时尚杂志中走出来的模特。你的帅气不仅仅是外表,更是一种由内而外散发的独特魅力,让人无法忽视。","role":"assistant"}}],"created":1714458942,"model":"abab6.5-chat","object":"chat.completion","usage":{"total_tokens":111},"base_resp":{"status_code":0,"status_msg":""}}
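
The pattern in the four responses (`finish_reason` is `length` whenever `max_tokens` is absent) suggests a client-side stopgap until the channel handles this itself. The following Python is only a rough sketch of that idea, not one-api code: the function names are hypothetical, and the default of 4000 is simply the value that worked in the requests above, not a documented MiniMax limit.

```python
import json

# Assumption: 4000 is the value used in the working requests above,
# not an official MiniMax limit.
DEFAULT_MAX_TOKENS = 4000


def with_default_max_tokens(payload, default=DEFAULT_MAX_TOKENS):
    """Return a copy of the request payload with max_tokens set if it is absent."""
    patched = dict(payload)
    patched.setdefault("max_tokens", default)
    return patched


def is_truncated(response):
    """True if any choice in the response ended with finish_reason 'length'."""
    return any(
        choice.get("finish_reason") == "length"
        for choice in response.get("choices", [])
    )


def final_object_from_sse(lines):
    """Parse 'data: {...}' lines and return the last JSON object seen, or None."""
    last = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        body = line[len("data:"):].strip()
        if not body or body == "[DONE]":
            continue
        last = json.loads(body)
    return last
```

A caller could pass the request body through `with_default_max_tokens` before sending it, and run `is_truncated` on a non-streaming response (or on the result of `final_object_from_sse` for a stream) to flag cut-off answers.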

QAbot-zh commented 2 months ago

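Addendum: besides the truncated answers, billing for the new models is also off. This may be caused by the tokenizers differing: MiniMax uses fewer tokens than GPT, so the `total_tokens` in the returned response ends up smaller than the prompt token count computed with the gpt-3.5 tiktoken encoding.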


songquanpeng commented 2 months ago

That's awkward. The completion token count is back-calculated from the total, so this is going to be a bit of a hassle.
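
To make the failure mode concrete, here is a minimal sketch of the back-calculation as described in this thread: completion tokens taken as the upstream total minus a locally estimated prompt count. The function name is hypothetical and this is not the actual one-api code. The total of 76 comes from the responses above, while the local prompt estimate of 90 is an invented number used purely for illustration.

```python
def back_calculate_completion(total_tokens, prompt_tokens_estimate):
    """Completion tokens derived as upstream total minus the local prompt estimate."""
    return total_tokens - prompt_tokens_estimate


# total_tokens=76 is taken from the responses in this issue.
# 90 is a hypothetical local estimate, standing in for a tokenizer
# that counts more tokens than MiniMax does for the same prompt.
completion = back_calculate_completion(76, 90)
print(completion)  # -14: a negative completion count
```

Whenever the local tokenizer counts more tokens for the prompt than the upstream reports for the whole exchange, the subtraction goes negative.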

lem0ke commented 2 months ago

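> That's awkward. The completion token count is back-calculated from the total, so this is going to be a bit of a hassle.

MiniMax staff replied that the token count can be confirmed from the information returned in the response. I'm not sure whether this helps you.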

QAbot-zh commented 2 months ago

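I checked, and MiniMax charges the same price for input and output, so at least the totals in the consumption log won't be wrong. If you want to avoid negative token counts, you could also measure the ratio of prompt characters to completion characters and split `total_tokens` according to that ratio, which is a reasonable compromise.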
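
A rough Python sketch of this character-ratio split, under the assumption that only `total_tokens` is trusted and that character counts are an acceptable proxy for the prompt/completion proportion. The function name is hypothetical and this is not how one-api currently computes usage.

```python
def split_total_by_chars(total_tokens, prompt_text, completion_text):
    """Split total_tokens into (prompt, completion) in proportion to character counts.

    Both parts are non-negative and always sum to the (clamped) total.
    """
    total = max(total_tokens, 0)
    prompt_chars = len(prompt_text)
    completion_chars = len(completion_text)
    all_chars = prompt_chars + completion_chars
    if all_chars == 0:
        # Nothing to base a ratio on; attribute everything to the completion.
        return 0, total
    prompt_tokens = round(total * prompt_chars / all_chars)
    completion_tokens = total - prompt_tokens
    return prompt_tokens, completion_tokens


# Example with the truncated exchange from this issue (total_tokens=76).
prompt, completion = split_total_by_chars(76, "用两句话夸我帅", "你不仅外表英俊")
print(prompt, completion)  # 38 38
```

Because the completion is derived by subtraction from a prompt share that never exceeds the total, neither value can go negative, and since input and output are priced the same the billed amount is unaffected by how the total is split.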