Does the current vLLM deployment support multiple images?

uYu commented 2 months ago

Does the current vLLM deployment support multiple images? If so, what should the request look like? If not, will it be supported in the future?

ShuaiBai623 commented 2 months ago

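Yes, it is supported. Just put each image at the position where you want it and assemble everything into a list. You can refer to this example: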

from openai import OpenAI

# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "file:///path/to/image1.jpg"},
            {"type": "image", "image": "file:///path/to/image2.jpg"},
            {"type": "text", "text": "Identify the similarities between these images."},
        ],
    }
]
chat_response = client.chat.completions.create(
    model="Qwen2-7B-Instruct",
    messages=messages,
)
print("Chat response:", chat_response)

uYu commented 2 months ago

I reinstalled the latest vLLM and it works now.

Ulov888 commented 2 months ago

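Quoting ShuaiBai623: "Yes, it is supported. Just put each image at the position where you want it and assemble everything into a list. You can refer to this example."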

The example here does not work for passing multiple images through the OpenAI API.

I tried it: the OpenAI API format does not support multiple consecutive images and returns an error. My request body is:

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_string1}"}},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_string2}"}},
            {"type": "text", "text": "describe the image."},
        ],
    }
]

The error message is: openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': 'Multiple multimodal inputs is currently not supported.', 'type': 'BadRequestError', 'param': None, 'code': 400}
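For reference, a request body of this shape can be assembled with a couple of small helpers. This is only a sketch: `to_data_url` and `build_multi_image_messages` are hypothetical names introduced here, and the placeholder bytes stand in for real image files. Whether the server accepts more than one image depends on the vLLM build and its configured image limit, as the rest of this thread discusses.

```python
import base64


def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_multi_image_messages(image_urls, prompt):
    """Build one user message: an image_url part per image, then the text part."""
    content = [{"type": "image_url", "image_url": {"url": url}} for url in image_urls]
    content.append({"type": "text", "text": prompt})
    return [{"role": "user", "content": content}]


# Placeholder bytes are used here; in practice read them with open(path, "rb").read().
urls = [to_data_url(b"fake-image-1"), to_data_url(b"fake-image-2")]
messages = build_multi_image_messages(urls, "Describe the images.")
```

The resulting `messages` list can be passed as the `messages` argument of `client.chat.completions.create(...)`.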

uYu commented 2 months ago

You need to install this: https://github.com/fyabc/vllm/tree/add_qwen2_vl_new

skywalkerfmc commented 2 months ago

When deploying with vLLM, is the tool_choice = "auto" parameter not supported in requests?

zailushang2006 commented 2 months ago

Quoting uYu: "You need to install this: https://github.com/fyabc/vllm/tree/add_qwen2_vl_new"

Hello, why does it still not work after I installed it?

It still accepts only one image.

openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': 'At most 1 image(s) may be provided in one request.', 'type': 'BadRequestError', 'param': None, 'code': 400}

The two entries in the list are as follows:

{"type": "image_url", "image_url": {"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"}},
{"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}},

Also, would this form work? {"type": "image", "image": "file:///path/to/image1.jpg"} And what should the path look like on Windows?
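On the Windows path question, the general file URI convention is `file:///C:/...` with forward slashes and percent-encoded special characters. A minimal sketch of the conversion follows; `windows_path_to_file_uri` is a hypothetical helper and the paths are made up. The thread does not confirm whether the server accepts `file://` URIs at all, so this only illustrates the URI format.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote


def windows_path_to_file_uri(path: str) -> str:
    """Convert an absolute Windows path (with a drive letter) to a file:// URI."""
    posix_style = PureWindowsPath(path).as_posix()  # backslashes become forward slashes
    return "file:///" + quote(posix_style, safe="/:")  # keep "/" and the drive colon


uri = windows_path_to_file_uri(r"C:\Users\me\image1.jpg")
print(uri)  # file:///C:/Users/me/image1.jpg
```

`PureWindowsPath` parses Windows paths on any operating system, so the helper behaves the same on Linux and Windows.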

Ulov888 commented 2 months ago

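For the "At most 1 image(s) may be provided in one request" error, you need to modify the vLLM source code and raise the default maximum number of images. You can find the limit by following the traceback from where the error is raised.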

zailushang2006 commented 2 months ago

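Quoting uYu: "Does the current vLLM deployment support multiple images? If so, what should the request look like? If not, will it be supported in the future?"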

When starting the server, set the limit_mm_per_prompt parameter. For example, to allow at most 2 images: --limit-mm-per-prompt image=2
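As a rough sketch of how that flag fits into a launch command, plus a client-side check before sending a request: the `vllm serve` entry point and the model name `Qwen/Qwen2-VL-7B-Instruct` are assumptions not stated in this thread, and `build_server_command`, `count_images`, and `fits_limit` are hypothetical helpers. Only the `--limit-mm-per-prompt image=2` form comes from the comment above; the accepted syntax may differ between vLLM versions.

```python
def build_server_command(model: str, max_images: int) -> list[str]:
    """Assemble the argv for launching the server with an image limit."""
    # Assumption: the `vllm serve` CLI is available in the installed version.
    return ["vllm", "serve", model, "--limit-mm-per-prompt", f"image={max_images}"]


def count_images(messages) -> int:
    """Count image_url parts across all messages in a chat request."""
    total = 0
    for message in messages:
        content = message.get("content")
        if isinstance(content, list):  # plain-string content carries no images
            total += sum(1 for part in content if part.get("type") == "image_url")
    return total


def fits_limit(messages, max_images: int) -> bool:
    """Return True if the request stays within the configured image limit."""
    return count_images(messages) <= max_images


cmd = build_server_command("Qwen/Qwen2-VL-7B-Instruct", 2)
sample_messages = [{
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/a.jpg"}},
        {"type": "image_url", "image_url": {"url": "https://example.com/b.jpg"}},
        {"type": "text", "text": "Compare the two images."},
    ],
}]
print(" ".join(cmd))
print(fits_limit(sample_messages, 2))
```

Checking the count on the client side avoids a round trip that would end in the 400 error shown earlier in the thread.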

QwenLM / Qwen2-VL

Does the current vLLM deployment support multiple images? #63