QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Apache License 2.0

Qwen2-VL-2B-Instruct-GPTQ-Int4 does not support multi-turn image recognition #529

Open HaoWang81 opened 2 days ago

HaoWang81 commented 2 days ago

Qwen2-VL-2B-Instruct-GPTQ-Int4 does not support multi-turn image recognition. Error message: { "object": "error", "message": "At most 1 image(s) may be provided in one request.", "type": "BadRequestError", "param": null, "code": 400 }

HaoWang81 commented 2 days ago

vLLM request URL: http://127.0.0.1:9999/v1/chat/completions
Request body:


{
    "model": "Qwen2-VL-2B-Instruct-GPTQ-Int4",
    "stream": true,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"
                    }
                },
                {
                    "type": "text",
                    "text": "What does this image describe?"
                }
            ]
        },
        {
            "role": "assistant",
            "content": "This image contains the two words “TONGYI Qwen”"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"
                    }
                },
                {
                    "type": "text",
                    "text": "What does this image describe?"
                }
            ]
        },
        {
            "role": "assistant",
            "content": ""
        }
    ]
}
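
The error text says "in one request", which suggests the limit applies to the whole `messages` array and not to each turn. The sketch below assumes the server counts every `image_url` part across all turns; the helper names `count_images` and `user_turn` are made up for illustration and are not part of vLLM or Qwen2-VL. It rebuilds the shape of the request above and counts its images:

```python
IMAGE_URL = "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"


def count_images(messages):
    """Count image_url parts across every message in a chat request.

    Assumption: the server sums images over the whole request,
    as the wording of the error message suggests.
    """
    total = 0
    for message in messages:
        content = message.get("content")
        # Plain-string content (e.g. an assistant reply) has no image parts.
        if isinstance(content, list):
            total += sum(
                1
                for part in content
                if isinstance(part, dict) and part.get("type") == "image_url"
            )
    return total


def user_turn(text):
    """Build a user message with one image and one text part (hypothetical helper)."""
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": IMAGE_URL}},
            {"type": "text", "text": text},
        ],
    }


# Same structure as the request body above: two user turns, one image each.
messages = [
    user_turn("What does this image describe?"),
    {"role": "assistant", "content": "This image contains the two words “TONGYI Qwen”"},
    user_turn("What does this image describe?"),
]

print(count_images(messages))
```

Under that assumption the request carries two images, one more than the limit of 1 named in the error, even though each turn has only one. If the server is vLLM, its `--limit-mm-per-prompt` option (for example `image=2`) may raise this limit. The exact syntax depends on the vLLM version, so check it against the installed release.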