xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
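As a minimal sketch of the "single line" claim: Xinference exposes an OpenAI-compatible endpoint, so the official openai client only needs its base_url swapped. The port, API key handling, and model name below are assumptions, not taken from this thread:

```python
# Hedged sketch: point the official OpenAI client at a local Xinference
# server instead of api.openai.com. Endpoint and model UID are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9997/v1",  # local Xinference endpoint (default port)
    api_key="not-used",                   # Xinference ignores the key by default
)

response = client.chat.completions.create(
    model="chatglm3",  # the model UID as launched in Xinference
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```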
https://inference.readthedocs.io
Apache License 2.0

Error when chatting with chatglm3-6b through xinference from the Dify agent assistant #1274

Closed andylzming closed 1 month ago

andylzming commented 5 months ago

When chatting with chatglm3-6b through xinference using the Dify agent assistant, an error is raised; chatting with the basic assistant works fine. In addition, qwen-14 works normally with both the basic assistant and the agent assistant.

INFO 04-10 16:30:35 llm_engine.py:653] Avg prompt throughput: 24.8 tokens/s, Avg generation throughput: 6.5 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%
INFO 04-10 16:30:35 async_llm_engine.py:111] Finished request 2f3a1d40-f779-11ee-b1b4-80615f20f615.
2024-04-10 16:30:35,419 xinference.api.restful_api 27390 ERROR    [address=127.0.0.1:34773, pid=24418] 0
Traceback (most recent call last):
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/api/restful_api.py", line 1394, in create_chat_completion
    data = await model.chat(prompt, system_prompt, chat_history, kwargs)
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/pool.py", line 659, in send
    result = await self._run_coro(message.message_id, coro)
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
    return await coro
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/utils.py", line 45, in wrapped
    ret = await func(*args, **kwargs)
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/model.py", line 79, in wrapped_func
    ret = await fn(self, *args, **kwargs)
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xoscar/api.py", line 462, in _wrapper
    r = await func(self, *args, **kwargs)
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/model.py", line 375, in chat
    response = await self._call_wrapper(
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/model.py", line 103, in _async_wrapper
    return await fn(*args, **kwargs)
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/model.py", line 325, in _call_wrapper
    ret = await fn(*args, **kwargs)
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/model/llm/vllm/core.py", line 439, in async_chat
    return self._tool_calls_completion(
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/model/llm/utils.py", line 601, in _tool_calls_completion
    content, func, args = cls._eval_chatglm3_arguments(c, tools)
  File "/home/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/model/llm/utils.py", line 548, in _eval_chatglm3_arguments
    if isinstance(c[0], str):
KeyError: [address=127.0.0.1:34773, pid=24418] 0
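
The KeyError pins down the failure: in `_eval_chatglm3_arguments`, `c[0]` is evaluated on the model output `c`, and `KeyError: 0` means `c` is a dict at that point, so the positional index turns into a failed key lookup. Below is a minimal sketch of the failure mode; the dict shape is an assumption based on ChatGLM3's structured tool-call output, not taken from this thread:

```python
# Minimal reproduction of the KeyError seen in the traceback.
# When ChatGLM3 decides to call a tool, the generated content `c` can
# arrive as a dict (assumed shape below) rather than a string or list.
c = {"name": "get_weather", "parameters": {"city": "Beijing"}}

try:
    if isinstance(c[0], str):  # the line from the traceback
        print("plain string content")
except KeyError as err:
    print(f"KeyError: {err}")  # prints: KeyError: 0

# A hypothetical defensive rewrite would branch on the container type
# before doing any positional indexing:
if isinstance(c, dict):
    func, args = c.get("name"), c.get("parameters")
else:
    func, args = None, None
```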
qinxuye commented 5 months ago

Looks like the same issue as #1230.

andylzming commented 5 months ago

> Looks like the same issue as #1230.

The issue has not been resolved.

he498 commented 4 months ago

I ran into this problem too. My environment is CUDA 11.8 with Python 3.10.14. I downloaded the chatglm3-6b model from Hugging Face myself and registered it as a custom model in xinference, then connected xinference to Dify. When tools are enabled, the request sent to xinference triggers this error. The parameters used to register the custom model and to launch it are shown in the screenshots. [two attached screenshots: custom model registration and model launch parameters]
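
For reference, a minimal sketch of registering a locally downloaded chatglm3-6b as a custom model through Xinference's Python client, roughly matching the steps described above. The endpoint, local path, and model name are hypothetical, and the JSON fields follow Xinference's custom-LLM schema, which varies by version:

```python
# Hedged sketch: register a locally downloaded chatglm3-6b as a custom
# model, then launch it. Endpoint, model path, and model name are
# hypothetical; JSON fields follow Xinference's custom-LLM schema
# (version-dependent, check the docs for your release).
import json
from xinference.client import RESTfulClient

client = RESTfulClient("http://localhost:9997")

custom_llm = {
    "version": 1,
    "context_length": 8192,
    "model_name": "my-chatglm3-6b",   # hypothetical name
    "model_lang": ["en", "zh"],
    "model_ability": ["chat"],
    "model_family": "chatglm3",
    "model_specs": [
        {
            "model_format": "pytorch",
            "model_size_in_billions": 6,
            "quantizations": ["none"],
            "model_uri": "/data/models/chatglm3-6b",  # local HF download
        }
    ],
}

client.register_model(model_type="LLM", model=json.dumps(custom_llm), persist=True)
model_uid = client.launch_model(model_name="my-chatglm3-6b", model_format="pytorch")
print(f"Launched model with uid: {model_uid}")
```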

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] commented 1 month ago

This issue was closed because it has been inactive for 5 days since being marked as stale.