xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

The latest version of xinference fails to launch the qwen2-vl-instruct model #2554

Open · majestichou opened 3 days ago

majestichou commented 3 days ago

System Info / 系統信息

CUDA 12.2, CentOS 7

Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?

Yes, with Docker.

Version info / 版本信息

v0.16.3

The command used to start Xinference / 用以启动 xinference 的命令

docker run -d -v /home/llm-test/embedding_and_rerank_model:/root/models -p 9998:9997 --gpus all xprobe/xinference:latest xinference-local -H 0.0.0.0

Reproduction / 复现过程

  1. Download the model Qwen2-VL-7B-Instruct to the target directory /home/llm-test/embedding_and_rerank_model.
  2. Start Xinference with: docker run -d -v /home/llm-test/embedding_and_rerank_model:/root/models -p 9998:9997 --gpus all xprobe/xinference:latest xinference-local -H 0.0.0.0
  3. Open the web UI, click Launch Model, select the qwen2-vl-instruct model, set Model Path to /root/models/Qwen2-VL-7B-Instruct, and click the launch button.
  4. The launch fails with the following error:
    2024-11-14 08:35:27,521 xinference.core.worker 140 ERROR    Failed to load model qwen2-vl-instruct-0
    Traceback (most recent call last):
      File "/usr/local/lib/python3.10/dist-packages/xinference/core/worker.py", line 897, in launch_builtin_model
        await model_ref.load()
      File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/context.py", line 231, in send
        return self._process_result_message(result)
      File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/context.py", line 102, in _process_result_message
        raise message.as_instanceof_cause()
      File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/pool.py", line 659, in send
        result = await self._run_coro(message.message_id, coro)
      File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/pool.py", line 370, in _run_coro
        return await coro
      File "/usr/local/lib/python3.10/dist-packages/xoscar/api.py", line 384, in __on_receive__
        return await super().__on_receive__(message)  # type: ignore
      File "xoscar/core.pyx", line 558, in __on_receive__
        raise ex
      File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
        async with self._lock:
      File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
        with debug_async_timeout('actor_lock_timeout',
      File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
        result = await result
      File "/usr/local/lib/python3.10/dist-packages/xinference/core/model.py", line 398, in load
        self._model.load()
      File "/usr/local/lib/python3.10/dist-packages/xinference/model/llm/transformers/qwen2_vl.py", line 53, in load
        from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
    ImportError: [address=0.0.0.0:43921, pid=1213] cannot import name 'Qwen2VLForConditionalGeneration' from 'transformers' (/usr/local/lib/python3.10/dist-packages/transformers/__init__.py)
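The failing import points at the transformers package that ships inside the image. For reference, the installed version can be checked from the host with something like the following (just a sketch; <container_id> stands for the ID printed by the docker run above):

    docker exec <container_id> python -c "import transformers; print(transformers.__version__)"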

Expected behavior / 期待表现

The model should launch normally.

codingl2k1 commented 3 days ago

What is your transformers version? You can try upgrading transformers.
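For the Docker setup above, a minimal sketch of the upgrade inside the running container (<container_id> is a placeholder; pinning transformers>=4.45.0 is my assumption, since Qwen2VLForConditionalGeneration shipped around that release as far as I know):

    docker exec <container_id> pip install -U "transformers>=4.45.0"
    docker restart <container_id>

The package lands in the container's writable layer, so it survives a restart but is lost if the container is removed and recreated from the image.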

majestichou commented 3 days ago

@codingl2k1 Huh? Isn't the version bundled in the image good enough? I'm using the Docker image.

jacobdong commented 1 day ago

@codingl2k1
The qwen2-audio model fails the same way:

2024-11-17 16:25:41 ImportError: [address=0.0.0.0:46487, pid=176] cannot import name 'Qwen2AudioForConditionalGeneration' from 'transformers' (/usr/local/lib/python3.10/dist-packages/transformers/__init__.py)

ChiayenGu commented 7 hours ago

I ran into this problem too; my Docker image version is 0.16.0.

JumpNew commented 5 hours ago

After upgrading transformers to the latest version, the model launches successfully.
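To confirm the upgrade took effect, you can retry the exact import from the traceback inside the container (a quick check; <container_id> is a placeholder as above):

    docker exec <container_id> python -c "from transformers import Qwen2VLForConditionalGeneration; print('ok')"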