vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Bug]: ValueError: At most 1 image(s) may be provided in one request. #9739

Open eav-solution opened 3 weeks ago

eav-solution commented 3 weeks ago

Your current environment

vllm-openai v0.6.3.post1

Model Input Dumps

a_request: None, prompt_adapter_request: None.
2024-10-27 23:04:39 INFO 10-27 09:04:39 engine.py:290] Added request chat-35d8d255cc6f44359126404d2bcefd72.
2024-10-27 23:04:40 INFO: 172.24.0.1:51444 - "POST /v1/chat/completions HTTP/1.1" 200 OK
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156] Error in applying chat template from request
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156] Traceback (most recent call last):
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/serving_chat.py", line 124, in create_chat_completion
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]     conversation, mm_data_future = parse_chat_messages_futures(
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/chat_utils.py", line 529, in parse_chat_messages_futures
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]     sub_messages = _parse_chat_message_content(msg, mm_tracker)
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/chat_utils.py", line 464, in _parse_chat_message_content
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]     result = _parse_chat_message_content_parts(
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/chat_utils.py", line 416, in _parse_chat_message_content_parts
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]     mm_parser.parse_image(image_url["url"])
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/chat_utils.py", line 298, in parse_image
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]     placeholder = self._tracker.add("image", image_coro)
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/chat_utils.py", line 207, in add
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156]     raise ValueError(
2024-10-27 23:04:47 ERROR 10-27 09:04:47 serving_chat.py:156] ValueError: At most 1 image(s) may be provided in one request.
2024-10-27 23:04:47 INFO: 172.24.0.1:50008 - "POST /v1/chat/completions HTTP/1.1" 400 Bad Request

🐛 Describe the bug

I run vllm-openai with qwen2-vl-7b-instruct and get this error.


alex-jw-brooks commented 3 weeks ago

When you start the server, you need to initialize it with the limits allowed per modality type, which default to one. E.g., to allow passing up to two images, pass the command-line arg --limit-mm-per-prompt "image=2".
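
For reference, a minimal client-side sketch of a two-image request against a server started with that flag (the model name, server address, and image URLs below are placeholders, assuming something like vllm serve Qwen/Qwen2-VL-7B-Instruct --limit-mm-per-prompt "image=2"):

# Hypothetical client sketch; base_url, model name, and image URLs are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen2-VL-7B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Compare these two images."},
            {"type": "image_url", "image_url": {"url": "https://example.com/a.jpg"}},
            {"type": "image_url", "image_url": {"url": "https://example.com/b.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)

Without the flag, the second image_url part triggers the same "At most 1 image(s) may be provided in one request" error shown above.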

Similarly, if you are running offline, initialize the LLM with limit_mm_per_prompt, e.g., limit_mm_per_prompt={"image": 2}
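
A corresponding offline sketch (the prompt string below is illustrative, not the exact Qwen2-VL chat template):

# Offline sketch, assuming the Qwen2-VL-7B-Instruct model from this issue.
from vllm import LLM
from PIL import Image

llm = LLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    limit_mm_per_prompt={"image": 2},  # allow up to two images per prompt (default is 1)
)

# Images are passed alongside the prompt via multi_modal_data.
images = [Image.open("a.jpg"), Image.open("b.jpg")]
outputs = llm.generate({
    "prompt": "Describe the differences between the two images.",
    "multi_modal_data": {"image": images},
})
print(outputs[0].outputs[0].text)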

eav-solution commented 3 weeks ago

Understood, but will it drop older images once a certain number of images is reached? I think that would be better than raising an error.

SinanAkkoyun commented 3 weeks ago

@eav-solution Everything seems to work fine with --limit-mm-per-prompt "image=n". I tested two-image Q&A across many prompts, and the model appears to see both images in context perfectly well.

eav-solution commented 3 weeks ago

My environment crashes if I set it too large; it currently works with 6 but fails if > 6.

DarkLight1337 commented 3 weeks ago

My environment crashes if I set it too large; it currently works with 6 but fails if > 6.

Since each image can take up many tokens, the maximum number of images allowed is limited by the context length of the model.
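
To see why a hard limit can appear around a specific image count, here is a back-of-the-envelope sketch; the token figures are assumptions (for Qwen2-VL the visual token count varies with image resolution), not exact values:

# Illustrative arithmetic only; tokens_per_image and max_model_len are assumed numbers.
tokens_per_image = 1200   # assumed average visual tokens per image
text_tokens = 500         # assumed budget for the text prompt and generated output
max_model_len = 8192      # e.g. a server started with a reduced context window

max_images = (max_model_len - text_tokens) // tokens_per_image
print(max_images)  # -> 6 under these assumptions; a 7th image would overflow the context

Raising max_model_len (within what the model and available GPU memory support) or using lower-resolution images increases how many images fit in one request.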