Closed: BabyChouSr closed this issue 2 months ago.
Can you check out #7710 and see if it fixes your issue?
@DarkLight1337 is this currently fixed? I am still getting the same error with the Dockerfile.cpu in this tutorial.
Which version of vLLM are you using?
0.5.5. I pulled from the source yesterday, so I assume that is the latest available version. I also tried adding a separate RUN pip install vllm==0.5.5 to the Dockerfile to make sure it also happens in the latest release.
Text-only inference works fine for me (just text messages without any image), but I still get the following error with image inputs:
ValueError: Attempted to assign 1921 = 1921 multimodal tokens to 0 placeholders
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 64, in _log_task_completion
This also happens with microsoft/Phi-3-vision-128k-instruct, not only microsoft/Phi-3.5-vision-instruct.
You may have to increase max_model_len, as multimodal tokens count towards the limit. Any excess tokens will be truncated.
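For reference, if the context limit were the cause, it could be raised when constructing the engine. A minimal sketch assuming the Python API (8192 is only an example value; the equivalent server flag is --max-model-len):

from vllm import LLM

# Example only: raise the context window so the ~1,900 image tokens plus the
# text prompt fit within max_model_len.
llm = LLM(
    model="microsoft/Phi-3.5-vision-instruct",
    trust_remote_code=True,
    max_model_len=8192,
)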
I tried with a larger max_model_len (80,000) as well as without limiting it, and still get the same error.
I get this error on a CPU-only machine. I had been running it without any errors on another machine with a GPU.
(VllmWorkerProcess pid=234352) ERROR 08-27 10:30:28 multiproc_worker_utils.py:226] File "/home/{USER_NAME}/miniforge3/envs/vllm2/lib/python3.10/site-packages/vllm-0.5.5+cpu-py3.10-linux-x86_64.egg/vllm/model_executor/models/utils.py", line 88, in merge_multimodal_embeddings
(VllmWorkerProcess pid=234352) ERROR 08-27 10:30:28 multiproc_worker_utils.py:226] raise ValueError(
(VllmWorkerProcess pid=234352) ERROR 08-27 10:30:28 multiproc_worker_utils.py:226] ValueError: Attempted to assign 1921 = 1921 multimodal tokens to 0 placeholders
@DarkLight1337 this is the exact error I have. I get it in both the Docker and outside of the Docker.
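For readers following along: the check that fails here compares the number of image-embedding tokens produced by the vision encoder with the number of image placeholder positions found in the tokenized prompt, and in this case the prompt ends up with zero placeholders. A simplified sketch of that kind of check (illustrative only, not vLLM's actual implementation; the signature and variable names are assumptions):

import torch

def merge_multimodal_embeddings(input_ids: torch.Tensor,
                                inputs_embeds: torch.Tensor,
                                vision_embeds: torch.Tensor,
                                placeholder_token_id: int) -> torch.Tensor:
    # Prompt positions that should receive image embeddings.
    mask = input_ids == placeholder_token_id
    num_placeholders = int(mask.sum().item())
    num_mm_tokens = vision_embeds.shape[0]
    # The error in this thread is this count mismatch: 1921 image tokens
    # produced, but 0 placeholder positions found in the prompt.
    if num_mm_tokens != num_placeholders:
        raise ValueError(
            f"Attempted to assign {num_mm_tokens} multimodal tokens "
            f"to {num_placeholders} placeholders")
    inputs_embeds[mask] = vision_embeds
    return inputs_embeds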
@Isotr0py since you have a CPU-only environment (and also implemented this model), can you help investigate this? Thanks!
Ok, I will investigate this tonight.
A small addition, @DarkLight1337 @Isotr0py:
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
from vllm.assets.image import ImageAsset
from vllm.utils import FlexibleArgumentParser
llm = LLM(
model="microsoft/Phi-3.5-vision-instruct",
trust_remote_code=True
)
Image inputs work without any issues when I use the LLM as above with llm.generate; however, the OpenAI-compatible server (python -m vllm.entrypoints.openai.api_server --model microsoft/Phi-3.5-vision-instruct --trust-remote-code) still fails with the error above.
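For comparison, an offline generate call with an image input might look roughly like this (a sketch based on vLLM's multimodal prompt format; the image asset and the Phi-3.5 prompt template are assumptions, not taken from this thread):

from vllm import LLM, SamplingParams
from vllm.assets.image import ImageAsset

llm = LLM(model="microsoft/Phi-3.5-vision-instruct", trust_remote_code=True)

# Phi-3.5-vision expects an <|image_1|> placeholder inside the chat prompt.
prompt = "<|user|>\n<|image_1|>\nWhat's in this image?<|end|>\n<|assistant|>\n"
image = ImageAsset("cherry_blossom").pil_image

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)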
Please note that multi-image input is not yet supported by the OpenAI-compatible server. Can you provide a minimum reproducible example?
Sure, after starting the server as described above, run the following:
from openai import OpenAI
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8001/v1"  # make sure this port is correct; I changed it to 8001 in the server
client = OpenAI(
api_key=openai_api_key,
base_url=openai_api_base,
)
chat_response = client.chat.completions.create(
model="microsoft/Phi-3.5-vision-instruct",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "What's in this image?"},
{
"type": "image_url",
"image_url": {
"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
},
},
],
}],
)
print("Chat response:", chat_response)
@berkecanrizai I have created #7916 to fix this. Please take a look at this :)
Thanks, that was fast :D
Your current environment
vLLM version: 0.5.4
🐛 Describe the bug
Repro command:
Error: