xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

With xinference 1.0.0 I can launch and run qwen2-vl-7b-instruct, but calling it through the OpenAI API fails: model qwen2-vl-7b-instruct-0 not found #2595

Open yuxi9264 opened 5 days ago

yuxi9264 commented 5 days ago

System Info

CUDA version: 12.1, PyTorch version: 2.5.1, OS: Windows 10, Python version: 3.9.11, transformers: 4.46.3

Running Xinference with Docker? No, running locally (see the startup command below).

Version info

xinference 1.0.0

The command used to start Xinference

xinference-local --host 127.0.0.1 --port 9997

Reproduction

  1. Launch qwen2-vl-7b-instruct from the Xinference web UI.

  2. Import the OpenAI client: `from openai import OpenAI`

  3. Create the client:

     ```python
     openai_base_url = "http://127.0.0.1:9997/v1"
     client = OpenAI(api_key="EMPTY", base_url=openai_base_url)
     ```

  4. Set the image path (as a raw string, so that backslash sequences such as `\0` in `output\0.png` are not interpreted as escapes):

     ```python
     page_image_path = r"D:\PythonDemo\pdf_to_md_demo\pdf2markdown\output\0.png"
     ```

  5. Build the message and call the chat completions endpoint:

     ```python
     messages2 = [{
         "role": "user",
         "content": [
             {
                 "type": "image_url",
                 "image_url": {
                     # encode the local image file as base64
                     "url": f"data:image/png;base64,{encode_base64_content_from_local(page_image_path)}",
                 },
             },
             {"type": "text", "text": "What is in this image?"},
         ],
     }]
     resp = client.chat.completions.create(
         messages=messages2,
         model="qwen2-vl-7b-instruct",
         temperature=0.2,
     )
     print(resp)
     ```

  6. Result:

     ```
     File "D:\PythonDemo\pdf_to_md_demo\pdf2markdown\pdf_to_markdown.py", line 264, in openai_test
       resp = client.chat.completions.create(
     File "D:\PythonDemo\pdf_to_md_demo\venv\Lib\site-packages\openai\_utils\_utils.py", line 275, in wrapper
       return func(*args, **kwargs)
     File "D:\PythonDemo\pdf_to_md_demo\venv\Lib\site-packages\openai\resources\chat\completions.py", line 829, in create
       return self._post(
     File "D:\PythonDemo\pdf_to_md_demo\venv\Lib\site-packages\openai\_base_client.py", line 1278, in post
       return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
     File "D:\PythonDemo\pdf_to_md_demo\venv\Lib\site-packages\openai\_base_client.py", line 955, in request
       return self._request(
     File "D:\PythonDemo\pdf_to_md_demo\venv\Lib\site-packages\openai\_base_client.py", line 1059, in _request
       raise self._make_status_error_from_response(err.response) from None

     openai.BadRequestError: Error code: 400 - {'detail': '[address=10.27.164.119:61057, pid=36672] Model not found, uid: qwen2-vl-7b-instruct-0'}
     ```

  1. I don't understand why the model I pass in is qwen2-vl-7b-instruct, yet the server looks for qwen2-vl-7b-instruct-0.
  2. But if I change the model uid in Xinference to qwen2-vl-7b-instruct-0, it then reports that qwen2-vl-7b-instruct cannot be found.
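`encode_base64_content_from_local` is referenced in the repro code but not shown in the issue; a minimal sketch of what such a helper might look like (the name is taken from the issue, the implementation is an assumption):

```python
import base64


def encode_base64_content_from_local(path: str) -> str:
    """Read a local file and return its contents as a base64-encoded string.

    Hypothetical helper: the issue calls this function but does not show
    its implementation.
    """
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")
```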

Expected behavior

I hope the error above can be explained, so that calling Xinference through the OpenAI API works normally.

Valdanitooooo commented 5 days ago

Check the model list: http://127.0.0.1:9997/v1/models
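One way to act on this suggestion is to compare the `model` name passed to `chat.completions.create` against the `id` fields returned by `/v1/models`. A minimal sketch, assuming a payload shaped like the `SyncPage[Model]` output quoted later in this thread (fetching the JSON is left to your HTTP client of choice):

```python
def available_model_ids(models_payload: dict) -> list:
    """Extract the usable model ids from a /v1/models response payload."""
    return [m["id"] for m in models_payload.get("data", [])]


# Example payload shaped like the response quoted in this thread
payload = {
    "object": "list",
    "data": [{"id": "qwen2-vl-7b-instruct", "object": "model"}],
}
print(available_model_ids(payload))  # ['qwen2-vl-7b-instruct']
```

If the id you see here differs from the name in your request, use the listed id verbatim.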

yuxi9264 commented 5 days ago

The model list at http://127.0.0.1:9997/v1/models returns:

```
SyncPage[Model](data=[Model(id='qwen2-vl-7b-instruct', created=0, object='model', owned_by='xinference', model_type='LLM', address='10.27.164.119:62479', accelerators=['0'], model_name='qwen2-vl-7b-instruct', model_lang=['en', 'zh'], model_ability=['generate', 'chat', 'vision'], model_description='This is a qwen-vl-7b-instruct model', model_format='pytorch', model_size_in_billions=7, model_family='qwen2-vl-instruct', quantization='none', model_hub='huggingface', revision=None, context_length=8192, replica=1)], object='list')
```

10.27.164.119 is an internal network address.

Valdanitooooo commented 5 days ago

I haven't used xinference 1.0.0 yet. Deploying with vLLM works fine for me: https://github.com/Valdanitooooo/chat_with_qwen2_vl_test/blob/main/deploy/docker-compose.yml

qinxuye commented 5 days ago

Did the model process crash? You don't need to append the 0 — that is the replica id.
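To illustrate the naming scheme described above (a sketch inferred from the error message and the `replica=1` field in the model list, not Xinference's actual internals): the server derives per-replica ids by appending an index to the model uid, while the OpenAI-compatible endpoint expects the bare model uid.

```python
# Assumed naming scheme: replica ids are "<model_uid>-<index>"
model_uid = "qwen2-vl-7b-instruct"
replica = 1
replica_ids = [f"{model_uid}-{i}" for i in range(replica)]
print(replica_ids)  # ['qwen2-vl-7b-instruct-0']

# Requests should use model_uid, not a replica id:
request_model = model_uid
```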

yuxi9264 commented 4 days ago

> Did the model process crash? You don't need to append the 0 — that is the replica id.

Quite possibly. My GPU has only 16 GB, and I'm running the model at full precision on Windows. Through the OpenAI API, the first call fails with "the remote server refused the connection", and the second call reports that the model "qwen2-vl-7b-instruct-0" cannot be found — yet using it from the Xinference web UI works fine. Loading deepseek-vl-7b-chat in the web UI and calling it through the OpenAI API also works without issue, which is confusing.

goodsxx commented 3 days ago

I'm running into the same problem.

yuxi9264 commented 6 hours ago

Could you post your GPU and version info so we can compare?
