yuxi9264 opened 5 days ago
Checking the model list at http://127.0.0.1:9997/v1/models:
SyncPage[Model](data=[Model(id='qwen2-vl-7b-instruct', created=0, object='model', owned_by='xinference', model_type='LLM', address='10.27.164.119:62479', accelerators=['0'], model_name='qwen2-vl-7b-instruct', model_lang=['en', 'zh'], model_ability=['generate', 'chat', 'vision'], model_description='This is a qwen-vl-7b-instruct model', model_format='pytorch', model_size_in_billions=7, model_family='qwen2-vl-instruct', quantization='none', model_hub='huggingface', revision=None, context_length=8192, replica=1)], object='list')

(10.27.164.119 is an internal LAN address.)
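The listing above is what the OpenAI Python client returns from its models endpoint. A minimal sketch to reproduce the check, assuming the same local endpoint; the id field is what should later be passed as model:

from openai import OpenAI

# Point the client at xinference's OpenAI-compatible endpoint.
client = OpenAI(api_key="EMPTY", base_url="http://127.0.0.1:9997/v1")

# Each entry's id ("qwen2-vl-7b-instruct") is the name to use as `model`;
# the "-0" replica suffix seen in the error below is not part of it.
for model in client.models.list():
    print(model.id)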
I haven't used xinference 1.0.0 yet. Deploying with vLLM works fine for me: https://github.com/Valdanitooooo/chat_with_qwen2_vl_test/blob/main/deploy/docker-compose.yml
Did the model crash? You don't need to append the "0" at the end; that is the replica id.
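One way to check whether the model actually crashed is to re-query the model list after a failed call; if the worker died, the model should no longer be listed. A sketch using plain requests against the endpoint shown above:

import requests

# If the model worker has crashed, its entry disappears from /v1/models.
resp = requests.get("http://127.0.0.1:9997/v1/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])
# Expect ["qwen2-vl-7b-instruct"] -- note: no "-0" replica suffix.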
Possibly. My GPU is only a 16 GB card, and I am running the full-parameter (unquantized) model on Windows. With the OpenAI API, the first call fails with "the remote server refused the connection", and the second call then reports that the model "qwen2-vl-7b-instruct-0" cannot be found, yet using the model from the xinference web UI works fine. Loading deepseek-vl-7b-chat from the xinference page and calling it through the OpenAI API also works without problems, which is confusing.
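A small diagnostic sketch that tells the two failure modes described above apart (connection refused versus 400 model-not-found), assuming the openai v1 client's exception classes:

import time
from openai import OpenAI, APIConnectionError, BadRequestError

client = OpenAI(api_key="EMPTY", base_url="http://127.0.0.1:9997/v1")

# Try the same call twice: a refused connection suggests the server or
# worker is down; a 400 "Model not found" suggests the model was unloaded.
for attempt in (1, 2):
    try:
        client.chat.completions.create(
            model="qwen2-vl-7b-instruct",
            messages=[{"role": "user", "content": "ping"}],
            max_tokens=1,
        )
        print(f"attempt {attempt}: ok")
        break
    except APIConnectionError as err:
        print(f"attempt {attempt}: connection refused: {err}")
        time.sleep(3)
    except BadRequestError as err:
        print(f"attempt {attempt}: {err}")  # e.g. Model not found, uid: ...-0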
I am running into the same problem.
Post your GPU and version info so we can compare.
System Info / 系統信息
CUDA version: 12.1; PyTorch version: 2.5.1; OS: Windows 10; Python version: 3.9.11; transformers: 4.46.3
Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?
No, running locally with xinference-local (see the start command below).
Version info / 版本信息
xinference 1.0.0
The command used to start Xinference / 用以启动 xinference 的命令
xinference-local --host 127.0.0.1 --port 9997
Reproduction / 复现过程
Start qwen2-vl-7b-instruct from the xinference web UI.
Import openai and create a client pointed at the local endpoint:

from openai import OpenAI

openai_base_url = "http://127.0.0.1:9997/v1"
client = OpenAI(api_key="EMPTY", base_url=openai_base_url)

# Use a raw string so the Windows backslashes (e.g. "\0") are not
# interpreted as escape sequences.
page_image_path = r"D:\PythonDemo\pdf_to_md_demo\pdf2markdown\output\0.png"

messages2 = [{
    "role": "user",
    "content": [
        {
            "type": "image_url",
            "image_url": {
                # Encode the local image as base64 (note: "data:", not "data::").
                "url": f"data:image/png;base64,{encode_base64_content_from_local(page_image_path)}",
            },
        },
        {"type": "text", "text": "图片中的内容是什么?"},  # "What is in the image?"
    ],
}]

resp = client.chat.completions.create(
    messages=messages2,
    model="qwen2-vl-7b-instruct",
    temperature=0.2,
)
print(resp)
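The helper encode_base64_content_from_local is not shown in the report; presumably it just reads the file and base64-encodes it. A minimal sketch of such a helper:

import base64

def encode_base64_content_from_local(path: str) -> str:
    # Read the image file and return its contents base64-encoded as text.
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")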
Output:

File "D:\PythonDemo\pdf_to_md_demo\pdf2markdown\pdf_to_markdown.py", line 264, in openai_test
    resp = client.chat.completions.create(
File "D:\PythonDemo\pdf_to_md_demo\venv\Lib\site-packages\openai\_utils\_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
File "D:\PythonDemo\pdf_to_md_demo\venv\Lib\site-packages\openai\resources\chat\completions.py", line 829, in create
    return self._post(
File "D:\PythonDemo\pdf_to_md_demo\venv\Lib\site-packages\openai\_base_client.py", line 1278, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
File "D:\PythonDemo\pdf_to_md_demo\venv\Lib\site-packages\openai\_base_client.py", line 955, in request
    return self._request(
File "D:\PythonDemo\pdf_to_md_demo\venv\Lib\site-packages\openai\_base_client.py", line 1059, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'detail': '[address=10.27.164.119:61057, pid=36672] Model not found, uid: qwen2-vl-7b-instruct-0'}
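Note that the 400 is reported by a worker at the LAN address 10.27.164.119, not by the 127.0.0.1 endpoint the client targets. One way to inspect what is actually registered after the failure is xinference's own client rather than the OpenAI-compatible layer; a sketch assuming the RESTfulClient API from the xinference package:

from xinference.client import RESTfulClient

# List models through xinference's native API to see whether the
# "qwen2-vl-7b-instruct" entry survived the failed OpenAI call.
client = RESTfulClient("http://127.0.0.1:9997")
print(client.list_models())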
Expected behavior / 期待表现
I hope the error above can be explained so that calling xinference through the OpenAI API works correctly.