xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

Loading the qwen2-72b-instruct model with Xinference on AutoDL: the model never starts #2012

Open qq745639151 opened 2 months ago

qq745639151 commented 2 months ago

System Info

(screenshot) I created a virtual environment; the Python version is 3.11.9.

Running Xinference with Docker?

Version info

(screenshot: version info)

The command used to start Xinference

xinference-local --host 0.0.0.0 --port 9997

Reproduction

Register a custom model with the following parameters:

```json
{
  "version": 1,
  "context_length": 30720,
  "model_name": "qwen2-72b-instruct",
  "model_lang": ["en", "zh"],
  "model_ability": ["generate", "chat", "vision"],
  "model_description": "This is a custom model description.",
  "model_family": "qwen2-instruct",
  "model_specs": [
    {
      "model_format": "pytorch",
      "model_size_in_billions": 72,
      "quantizations": ["none"],
      "model_id": null,
      "model_hub": "huggingface",
      "model_uri": "/root/autodl-tmp/xinfhome/Qwen2-72B-Instruct",
      "model_revision": null
    }
  ],
  "prompt_style": {
    "style_name": "QWEN",
    "system_prompt": "You are a helpful assistant.",
    "roles": ["user", "assistant"],
    "intra_message_sep": "\n",
    "inter_message_sep": "",
    "stop": ["<|endoftext|>", "<|im_start|>", "<|im_end|>"],
    "stop_token_ids": [151643, 151644, 151645]
  },
  "is_builtin": false
}
```
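As a quick sanity check before registering, a spec like the one above can be validated for the basic fields a custom LLM registration uses. The helper below is a hypothetical illustration (not part of Xinference's API); the field names are taken from the JSON above:

```python
import json

# Top-level fields used by a custom model spec (taken from the JSON above).
REQUIRED_TOP_LEVEL = {"version", "model_name", "model_lang",
                      "model_ability", "model_family", "model_specs"}

def check_spec(raw: str) -> list:
    """Return a list of problems found in the spec JSON (empty if none)."""
    spec = json.loads(raw)
    problems = [f"missing field: {k}"
                for k in sorted(REQUIRED_TOP_LEVEL - spec.keys())]
    for s in spec.get("model_specs", []):
        # A pytorch-format spec loaded from disk needs a local model_uri.
        if s.get("model_format") == "pytorch" and not s.get("model_uri"):
            problems.append("pytorch spec needs a model_uri")
    return problems

spec_json = ('{"version": 1, "model_name": "qwen2-72b-instruct", '
             '"model_lang": ["en", "zh"], "model_ability": ["chat"], '
             '"model_family": "qwen2-instruct", "model_specs": '
             '[{"model_format": "pytorch", "model_uri": "/tmp/model"}]}')
print(check_spec(spec_json))  # [] means the basic fields are present
```

This only catches structural omissions (a missing `model_uri`, for example); it cannot tell whether the weights at that path are actually loadable.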

Launch the model: it then hangs indefinitely and never starts. The log also stops at this point, as shown below: (screenshot)

The web UI shows the following: (screenshot)

Expected behavior

I expect the model to start successfully.

qinxuye commented 2 months ago

Try downgrading vllm to 0.4.2 and see if that helps.