QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Apache License 2.0

Does the current vLLM deployment support multiple images? #63

Closed: uYu closed this issue 2 months ago

uYu commented 2 months ago

Does the current vLLM deployment support multiple images? If so, what does the request look like? If not, will it be supported in the future?

ShuaiBai623 commented 2 months ago

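Yes, it is supported. Just add the images at whatever positions you want and assemble everything into a list. You can refer to this example: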

from openai import OpenAI

# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "file:///path/to/image1.jpg"},
            {"type": "image", "image": "file:///path/to/image2.jpg"},
            {"type": "text", "text": "Identify the similarities between these images."},
        ],
    }
]
chat_response = client.chat.completions.create(
    model="Qwen2-7B-Instruct",  # must match the model name served by vLLM
    messages=messages,
)
print("Chat response:", chat_response)

uYu commented 2 months ago

I reinstalled the latest vLLM and it works now.

Ulov888 commented 2 months ago

The example here does not work for passing multiple images through the OpenAI interface.

I tried it: the OpenAI interface format does not support several consecutive images and returns an error. My request body is:

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_string1}"}},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_string2}"}},
            {"type": "text", "text": "describe the image."},
        ],
    }
]

The error message is: openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': 'Multiple multimodal inputs is currently not supported.', 'type': 'BadRequestError', 'param': None, 'code': 400}
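The request body in this comment assumes that base64_string1 and base64_string2 already exist. As a minimal sketch of how such strings might be produced with the Python standard library, the helpers below encode image bytes into the data URL form used in the image_url entries. The names to_data_url and file_to_data_url are hypothetical, and the JPEG fallback MIME type is an assumption.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL for an image_url entry."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def file_to_data_url(path: str) -> str:
    """Read an image file and encode it as a data URL.

    The MIME type is guessed from the file extension; if it cannot be
    guessed, image/jpeg is assumed.
    """
    mime_type, _ = mimetypes.guess_type(path)
    return to_data_url(Path(path).read_bytes(), mime_type or "image/jpeg")
```

Each returned string can be used directly as the "url" value of an image_url entry. This only prepares the payload; whether the server accepts more than one image still depends on the vLLM build and its configuration.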

uYu commented 2 months ago

You need to install this branch: https://github.com/fyabc/vllm/tree/add_qwen2_vl_new

skywalkerfmc commented 2 months ago

When deploying with vLLM, is the tool_choice = "auto" parameter not supported in requests?

zailushang2006 commented 2 months ago

> You need to install this branch: https://github.com/fyabc/vllm/tree/add_qwen2_vl_new

Hi, why does it still not work after I installed it?

It still accepts only one image.

openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': 'At most 1 image(s) may be provided in one request.', 'type': 'BadRequestError', 'param': None, 'code': 400}

The two entries in the list are as follows: { "type": "image_url", "image_url": { "url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg" } }, { "type": "image_url", "image_url": { "url": f"data:image/jpeg;base64,{base64_image}" } },

Also, can a form like this work? {"type": "image", "image": "file:///path/to/image1.jpg"} And what does the path look like on Windows?
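On the Windows path question, a small sketch using only the Python standard library shows what a file:// URI looks like for a Windows path compared with a POSIX one. The example paths are hypothetical, and this only illustrates the URI syntax. It does not confirm that the vLLM server accepts file:// URIs in a request.

```python
from pathlib import PurePosixPath, PureWindowsPath

# A Windows path keeps its drive letter after the third slash.
windows_uri = PureWindowsPath(r"C:\Users\me\Pictures\image1.jpg").as_uri()

# A POSIX path maps directly onto the part after file://.
posix_uri = PurePosixPath("/path/to/image1.jpg").as_uri()

print(windows_uri)  # file:///C:/Users/me/Pictures/image1.jpg
print(posix_uri)    # file:///path/to/image1.jpg
```

PureWindowsPath works on any operating system, so the conversion can be checked without a Windows machine.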

Ulov888 commented 2 months ago

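You need to modify the vLLM source code and raise the default limit on the maximum number of images. You can find the place by following the error back to where it is raised.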

zailushang2006 commented 2 months ago

> Does the current vLLM deployment support multiple images? If so, what does the request look like? If not, will it be supported in the future?

When starting the server, set the limit_mm_per_prompt parameter. For example, to allow at most 2 images: --limit-mm-per-prompt image=2
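Putting the pieces of this thread together, here is a minimal sketch of a multi-image request against a server started with --limit-mm-per-prompt image=2. It uses the image_url content format and the localhost endpoint shown earlier in the thread. The helper names build_messages and ask, the MAX_IMAGES constant, and the JPEG MIME type are assumptions for illustration and are not part of vLLM or the OpenAI client.

```python
import base64

# Assumed to match the value passed to --limit-mm-per-prompt image=2.
MAX_IMAGES = 2


def build_messages(images: list[bytes], prompt: str,
                   max_images: int = MAX_IMAGES) -> list[dict]:
    """Build one user message with several images followed by a text prompt."""
    if len(images) > max_images:
        raise ValueError(
            f"{len(images)} images given, but the server limit is assumed "
            f"to be {max_images}"
        )
    content = [
        {
            "type": "image_url",
            "image_url": {
                "url": "data:image/jpeg;base64,"
                       + base64.b64encode(img).decode("ascii")
            },
        }
        for img in images
    ]
    content.append({"type": "text", "text": prompt})
    return [{"role": "user", "content": content}]


def ask(images: list[bytes], prompt: str, model: str) -> str:
    """Send the request to a local vLLM OpenAI-compatible server."""
    # Imported here so build_messages works without the openai package.
    from openai import OpenAI

    client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")
    response = client.chat.completions.create(
        model=model,  # must match the model name served by vLLM
        messages=build_messages(images, prompt),
    )
    return response.choices[0].message.content
```

Checking the image count on the client side is a design choice: it fails early with a readable message, so the request never reaches the server only to come back as a 400 error like the ones quoted above.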