xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

QUESTION: GLM3-6b function call error #958

Closed: ChiayenGu closed this 3 weeks ago

ChiayenGu commented 7 months ago

Version: 0.8.2, CUDA: 12.2

Steps to reproduce: launch glm3-6b with the pytorch backend, then send the following message to the /v1/chat/completions endpoint:

{
  "model": "chatglm3",
  "temperature": 0,
  "messages": [
    {
      "role": "user",
      "content": "请你帮助我使用工具将用户的问题进行分类,请注意,不需要你回答问题。\n需要分类的问题是: `\"中东局势怎么样了\"`"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "classify_question",
        "description": "根据对话记录及补充的背景知识,对问题进行分类,并返回对应的类型字段",
        "parameters": {
          "type": "object",
          "properties": {
            "type": {
              "type": "string",
              "description": "问题类型。下面是几种可选的问题类型: 关于军事新闻、时事政治、国际组织的问题,返回:'wqre';关于国务院的问题,返回:'sdfa';用户想要查询日程,返回:'oy1c';其他问题,返回:'86lk'",
              "enum": [
                "wqre",
                "sdfa",
                "oy1c",
                "86lk"
              ]
            }
          },
          "required": [
            "type"
          ]
        }
      }
    }
  ],
  "tool_choice": {
    "type": "function",
    "function": {
      "name": "classify_question"
    }
  }
}
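
For reference, here is roughly the same request through the openai Python client (v1+). The base_url (xinference-local's default port 9997) and the dummy api_key are assumptions for a default local deployment:

from openai import OpenAI

# Point base_url at your Xinference server; the api_key value is ignored.
client = OpenAI(base_url="http://0.0.0.0:9997/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "classify_question",
        "description": "Classify the question and return the matching type code",
        "parameters": {
            "type": "object",
            "properties": {
                "type": {
                    "type": "string",
                    "description": "Question type code",
                    "enum": ["wqre", "sdfa", "oy1c", "86lk"],
                }
            },
            "required": ["type"],
        },
    },
}]

response = client.chat.completions.create(
    model="chatglm3",
    temperature=0,
    messages=[{"role": "user", "content": "Classify: 'What is the situation in the Middle East?'"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "classify_question"}},
)
print(response.choices[0].message.tool_calls)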

Error message:

2024-01-31 17:17:39,690 xinference.api.restful_api 31719 ERROR    [address=0.0.0.0:42925, pid=69006] 0
Traceback (most recent call last):
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/api/restful_api.py", line 1207, in create_chat_completion
    data = await model.chat(prompt, system_prompt, chat_history, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/core/utils.py", line 43, in wrapped
    ret = await func(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/core/model.py", line 78, in wrapped_func
    ret = await fn(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/api.py", line 462, in _wrapper
    r = await func(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/core/model.py", line 374, in chat
    response = await self._call_wrapper(
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/core/model.py", line 102, in _async_wrapper
    return await fn(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/core/model.py", line 324, in _call_wrapper
    ret = await fn(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/model/llm/vllm/core.py", line 415, in async_chat
    return self._tool_calls_completion(
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/model/llm/utils.py", line 551, in _tool_calls_completion
    content, func, args = cls._eval_chatglm3_arguments(c, tools)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/model/llm/utils.py", line 511, in _eval_chatglm3_arguments
    if isinstance(c[0], str):
    ^^^^^^^^^^^^^^^^^
KeyError: [address=0.0.0.0:42925, pid=69006] 0

The official ChatGLM api-server does not have this problem. Could someone please take a look?

aresnow1 commented 7 months ago

We'll try to reproduce it. In the meantime, you can try starting xinference with XINFERENCE_DISABLE_VLLM=1 and see whether that works: XINFERENCE_DISABLE_VLLM=1 xinference-local -H 0.0.0.0 --log-level=debug
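
For a scripted version of that workaround (the launch flags below are assumptions based on the standard xinference CLI and may need adjusting for your setup):

# Start the local server with vLLM disabled, so chatglm3 falls back to the
# transformers (pytorch) backend.
XINFERENCE_DISABLE_VLLM=1 xinference-local -H 0.0.0.0 --log-level=debug &

# Then, in another shell, launch the 6B chatglm3 in pytorch format.
xinference launch --model-name chatglm3 --model-format pytorch --size-in-billions 6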

ChiayenGu commented 7 months ago

We'll try to reproduce it. In the meantime, you can try starting xinference with XINFERENCE_DISABLE_VLLM=1 and see whether that works: XINFERENCE_DISABLE_VLLM=1 xinference-local -H 0.0.0.0 --log-level=debug

I tried the command you gave this morning, and tool calls now work correctly. However, I hit a new problem: temperature cannot be set to 0.0. Changing it to 0.01 produces normal output.

The error is:

2024-02-01 09:04:34,695 xinference.api.restful_api 73901 ERROR    [address=0.0.0.0:43615, pid=74082] `temperature` (=0.0) has to be a strictly positive float, otherwise your next token scores will be invalid. If you're looking for greedy decoding strategies, set `do_sample=False`.
Traceback (most recent call last):
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/api/restful_api.py", line 1207, in create_chat_completion
    data = await model.chat(prompt, system_prompt, chat_history, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/core/utils.py", line 43, in wrapped
    ret = await func(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/core/model.py", line 78, in wrapped_func
    ret = await fn(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xoscar/api.py", line 462, in _wrapper
    r = await func(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/core/model.py", line 369, in chat
    response = await self._call_wrapper(
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/core/model.py", line 102, in _async_wrapper
    return await fn(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/core/model.py", line 326, in _call_wrapper
    ret = await asyncio.to_thread(fn, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
      ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/xinference/model/llm/pytorch/chatglm.py", line 137, in chat
    msg = self._model.chat(
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/.cache/huggingface/modules/transformers_modules/chatglm3-pytorch-6b/modeling_chatglm.py", line 1035, in chat
    outputs = self.generate(**inputs, **gen_kwargs, eos_token_id=eos_token_id)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/transformers/generation/utils.py", line 1753, in generate
    logits_warper = self._get_logits_warper(generation_config)
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/transformers/generation/utils.py", line 917, in _get_logits_warper
    warpers.append(TemperatureLogitsWarper(generation_config.temperature))
    ^^^^^^^^^^^^^^^^^
  File "/home/cetc15/anaconda3/envs/glm3/lib/python3.11/site-packages/transformers/generation/logits_process.py", line 277, in __init__
    raise ValueError(except_msg)
    ^^^^^^^^^^^^^^^^^
qinxuye commented 7 months ago

@codingl2k1 please take a look at whether this can be reproduced.

codingl2k1 commented 7 months ago

The restriction that temperature cannot be 0 appears to come from transformers itself: https://github.com/huggingface/transformers/blob/main/src/transformers/generation/logits_process.py#L271
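
A minimal sketch that confirms the constraint against transformers directly (no model required; the check fires when the logits warper is constructed):

from transformers.generation.logits_process import TemperatureLogitsWarper

TemperatureLogitsWarper(0.01)     # fine: strictly positive float

try:
    TemperatureLogitsWarper(0.0)  # raises before generation even starts
except ValueError as err:
    print(err)  # "`temperature` (=0.0) has to be a strictly positive float, ..."

# As the error message suggests, greedy decoding should be requested with
# do_sample=False instead of temperature=0.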

github-actions[bot] commented 3 weeks ago

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] commented 3 weeks ago

This issue was closed because it has been inactive for 5 days since being marked as stale.