xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

BUG: When running inference with the models Qwen-VL-Chat-Int4 and Yi-VL-6B, the Model Engine cannot be found #1664

Closed · okwinds closed this 3 weeks ago

okwinds commented 2 months ago

First, register the model as shown in the following screenshot.

image

Second, find Qwen-VL-Chat-Int4 in the list of custom models, as shown in the following screenshot.

image

Qwen-VL-Chat-Int4

image image

xinference, version 0.12.1

okwinds commented 2 months ago

Additional note: Yi-VL-6B has the same issue.

image image
ChengjieLi28 commented 2 months ago

@okwinds Please provide the full screenshot. Did you select vision for the VL models in the Model Abilities section of the UI?

okwinds commented 2 months ago

@okwinds Please provide the full screenshot. Did you select vision for the VL models in the Model Abilities section of the UI?

Yep, I registered again:

image

image

image

yiyangshen commented 2 months ago

Do not choose Generate in Abilities.

okwinds commented 2 months ago

Do not choose Generate in Abilities.

Tried again; it still doesn't work.

"model_ability": [
    "vision",
    "chat"
],
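Presumably the Model Engine dropdown is populated by matching the abilities declared in the registration spec against what each engine supports, which is why a missing or wrong ability leaves the dropdown empty. A minimal sketch of such matching (the engine table and `match_engines` are illustrative, not Xinference internals):

```python
# Hypothetical sketch of ability-based engine filtering.
# SUPPORTED_ENGINES and match_engines are illustrative names,
# not the actual Xinference implementation.
SUPPORTED_ENGINES = {
    "Transformers": {"chat", "vision", "generate"},
    "vLLM": {"chat", "generate"},
}

def match_engines(model_ability):
    """Return the engines that support every declared ability."""
    wanted = set(model_ability)
    return [name for name, abilities in SUPPORTED_ENGINES.items()
            if wanted <= abilities]

print(match_engines(["vision", "chat"]))  # -> ['Transformers']
print(match_engines(["generate"]))        # -> ['Transformers', 'vLLM']
```

Under this model, declaring `["vision", "chat"]` should still match a vision-capable engine, so an empty dropdown points at a matching bug rather than a mistake in the spec.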
amumu96 commented 1 month ago

This error cannot be reproduced in version 0.13.1; please try upgrading to the latest version. @okwinds

okwinds commented 1 month ago

I updated Xinference and am now running version 0.13.2.

With the same steps, the same problem still exists.

JSON spec:

{
    "version": 1,
    "context_length": 20000,
    "model_name": "Yi-VL-6B",
    "model_lang": [
        "en",
        "zh"
    ],
    "model_ability": [
        "chat",
        "vision"
    ],
    "model_description": "/home/llm/yi/Yi-VL-6B",
    "model_family": "yi-vl-chat",
    "model_specs": [
        {
            "model_format": "pytorch",
            "model_size_in_billions": 6,
            "quantizations": [
                "none"
            ],
            "model_id": null,
            "model_hub": "huggingface",
            "model_uri": "/home/llm/yi/Yi-VL-6B",
            "model_revision": null
        }
    ],
    "prompt_style": {
        "style_name": "CHATML",
        "system_prompt": "",
        "roles": [
            "<|im_start|>user",
            "<|im_start|>assistant"
        ],
        "intra_message_sep": "<|im_end|>",
        "inter_message_sep": "",
        "stop": [
            "<|endoftext|>",
            "<|im_start|>",
            "<|im_end|>",
            "<|im_sep|>"
        ],
        "stop_token_ids": [
            2,
            6,
            7,
            8
        ]
    },
    "is_builtin": false
}

@amumu96 @qinxuye
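As a sanity check independent of the UI, a spec like the one above can be validated before registering it. A minimal sketch (the required-key list is inferred from the spec itself, not an authoritative Xinference schema):

```python
# Quick sanity check of a custom VL model spec before registration.
# REQUIRED_KEYS is an assumption based on the spec in this thread,
# not an official Xinference schema.
import json

REQUIRED_KEYS = {"version", "model_name", "model_lang",
                 "model_ability", "model_family", "model_specs"}

def check_spec(text):
    spec = json.loads(text)  # raises ValueError on malformed JSON
    missing = REQUIRED_KEYS - spec.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if "vision" not in spec["model_ability"]:
        raise ValueError("VL models must declare the 'vision' ability")
    return spec

spec = check_spec('''{"version": 1, "model_name": "Yi-VL-6B",
    "model_lang": ["en", "zh"], "model_ability": ["chat", "vision"],
    "model_family": "yi-vl-chat", "model_specs": []}''')
print(spec["model_name"])
```

If the file validates, it can also be registered from the command line, something like `xinference register --model-type LLM --file yi-vl-6b.json --persist` (check `xinference register --help` for the exact flags in your version).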

xueqizhang121 commented 1 month ago

Following this issue. I hit the same problem on both 0.11.3 and 0.13.1.

lorra1990 commented 1 month ago

The problem still exists in version 0.13.2. If other models were deployed before and a cached config exists, configuration works normally; without the cache, no engine can be selected and the dropdown is empty. image

ricky977 commented 4 weeks ago

Same issue here!

Yog-AI commented 3 weeks ago

0.12.0 keeps proving its worth. I worked around this by downgrading: pip install xinference==0.12.0

ricky977 commented 3 weeks ago

0.12.0 keeps proving its worth. I worked around this by downgrading: pip install xinference==0.12.0

Does pip install xinference==0.12.0 install xinference[all] by default, and how do I install xinference[transformers]?
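For what it's worth, a plain `pip install xinference` installs only the core package, not the `[all]` extras; inference backends are opted into via pip extras. A sketch, assuming the extras names used by Xinference's packaging:

```shell
# Core package only (no inference backends bundled):
pip install "xinference==0.12.0"

# With the Transformers backend:
pip install "xinference[transformers]==0.12.0"

# With all supported backends:
pip install "xinference[all]==0.12.0"
```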

qinxuye commented 3 weeks ago

This should already be fixed on the main branch; please try it after this week's release. If you use the Docker image, you can try pulling the nightly-main tag.