System Info
I created a virtual environment using Python 3.11.9.
Running Xinference with Docker?
Version info
The command used to start Xinference
xinference-local --host 0.0.0.0 --port 9997
Reproduction
Register the model with the following parameters:

{
  "version": 1,
  "context_length": 30720,
  "model_name": "qwen2-72b-instruct",
  "model_lang": ["en", "zh"],
  "model_ability": ["generate", "chat", "vision"],
  "model_description": "This is a custom model description.",
  "model_family": "qwen2-instruct",
  "model_specs": [
    {
      "model_format": "pytorch",
      "model_size_in_billions": 72,
      "quantizations": ["none"],
      "model_id": null,
      "model_hub": "huggingface",
      "model_uri": "/root/autodl-tmp/xinfhome/Qwen2-72B-Instruct",
      "model_revision": null
    }
  ],
  "prompt_style": {
    "style_name": "QWEN",
    "system_prompt": "You are a helpful assistant.",
    "roles": ["user", "assistant"],
    "intra_message_sep": "\n",
    "inter_message_sep": "",
    "stop": ["<|endoftext|>", "<|im_start|>", "<|im_end|>"],
    "stop_token_ids": [151643, 151644, 151645]
  },
  "is_builtin": false
}
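As a quick way to rule out an obviously malformed registration file, the JSON above can be sanity-checked before registering. This is only a minimal sketch: the field names follow the spec pasted in this issue, but the `check` function and its rules are my own assumption, not an official Xinference validator.

```python
import json

# A trimmed copy of the registration JSON from this issue, used here
# only to demonstrate the check; the full file has more fields.
spec = json.loads("""
{
  "model_name": "qwen2-72b-instruct",
  "model_family": "qwen2-instruct",
  "model_specs": [
    {
      "model_format": "pytorch",
      "model_size_in_billions": 72,
      "quantizations": ["none"],
      "model_uri": "/root/autodl-tmp/xinfhome/Qwen2-72B-Instruct"
    }
  ]
}
""")

def check(spec: dict) -> list[str]:
    """Return a list of problems found in a registration spec (hypothetical rules)."""
    problems = []
    # Fields the issue's own JSON treats as mandatory.
    for key in ("model_name", "model_family", "model_specs"):
        if key not in spec:
            problems.append(f"missing required field: {key}")
    # A local pytorch spec needs a model_uri pointing at the weights.
    for s in spec.get("model_specs", []):
        if s.get("model_format") == "pytorch" and not s.get("model_uri"):
            problems.append("pytorch spec needs a local model_uri")
    return problems

print(check(spec))  # prints [] when the spec looks consistent
```

If this prints an empty list, the structure itself is fine and the hang is more likely in model loading (e.g. weight files or GPU memory) than in the registration JSON.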
Start the model: it then waits indefinitely and never comes up. The log also stops at this point, as shown in the screenshot below:
The web UI shows the following:
Expected behavior
The model should launch and run successfully.