vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Bug]: Unable to load vision model llava 1.5 7b with tensor-parallel-size > 1 using vllm 0.4.0.post1 #4010

Closed stikkireddy closed 3 months ago

stikkireddy commented 5 months ago

Your current environment

The output of `python collect_env.py`

🐛 Describe the bug

I expect the language portion of the model to load correctly across the GPUs I have available.

Error: AssertionError: Provide `image_input_type` and other vision related configurations through LLM entrypoint or engine arguments.

But as far as I can tell I am providing it, following the existing example tests.

from vllm import LLM

model = LLM(
    model="llava-hf/llava-1.5-7b-hf",
    image_input_type="pixel_values",
    download_dir="/tmp/models",
    image_token_id=32000,
    image_input_shape="1,3,336,336",
    image_feature_size=576,
    tensor_parallel_size=2,
)

(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44] Error executing method load_model. This might cause deadlock in distributed execution.
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44] Traceback (most recent call last):
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44]   File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-8b85b5ac-0966-40f5-8881-cf111940a211/lib/python3.10/site-packages/vllm/engine/ray_utils.py", line 37, in execute_method
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44]     return executor(*args, **kwargs)
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44]   File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-8b85b5ac-0966-40f5-8881-cf111940a211/lib/python3.10/site-packages/vllm/worker/worker.py", line 107, in load_model
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44]     self.model_runner.load_model()
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44]   File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-8b85b5ac-0966-40f5-8881-cf111940a211/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 95, in load_model
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44]     self.model = get_model(
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44]   File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-8b85b5ac-0966-40f5-8881-cf111940a211/lib/python3.10/site-packages/vllm/model_executor/model_loader.py", line 93, in get_model
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44]     model = model_class(model_config.hf_config,
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44]   File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-8b85b5ac-0966-40f5-8881-cf111940a211/lib/python3.10/site-packages/vllm/model_executor/models/llava.py", line 71, in __init__
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44]     assert self.vision_language_config, (
(RayWorkerVllm pid=23847) ERROR 04-11 13:22:43 ray_utils.py:44] AssertionError: Provide `image_input_type` and other vision related configurations through LLM entrypoint or engine arguments.
INFO 04-11 13:22:43 weight_utils.py:177] Using model weights format ['.safetensors']

Isotr0py commented 5 months ago

I think #3883 should have fixed this. You can try building from the main branch to pick up the fix.
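Before rebuilding, it may help to confirm which vllm build the environment actually imports. The snippet below is only a sketch: the helper names `installed_version` and `still_on_affected_release` are made up for illustration, and the only version it knows about is 0.4.0.post1 from the issue title. This thread does not say which release, if any, contains #3883, so the check can only tell you that you are still on the reported version, not that a different version is fixed.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Version reported as affected in the issue title. Nothing in the thread
# says which later release includes #3883, so no "fixed" version is assumed.
AFFECTED_VERSION = "0.4.0.post1"


def installed_version(package: str = "vllm") -> Optional[str]:
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def still_on_affected_release(installed: Optional[str],
                              affected: str = AFFECTED_VERSION) -> bool:
    """True only when the installed version is exactly the reported one."""
    return installed == affected


if __name__ == "__main__":
    current = installed_version()
    if current is None:
        print("vllm is not installed in this environment")
    elif still_on_affected_release(current):
        print(f"vllm {current} is the version reported in this issue")
    else:
        print(f"vllm {current} differs from the reported version")
```

If it still reports 0.4.0.post1 after a rebuild, the new build was most likely installed into a different environment than the one running the workers.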