Describe the bug
The official vLLM documentation lists Phi-3-Vision (`microsoft/Phi-3-vision-128k-instruct`, architecture `Phi3VForCausalLM`) as supported, but the model fails to load at engine construction.
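The failure happens before any request is served, so it should also reproduce with the offline entrypoint; here is a minimal sketch (`trust_remote_code=True` is an assumption on my part, since the model repo ships custom modeling code):

```python
from vllm import LLM

# Engine construction resolves the model architecture against vLLM's model
# registry, which is where the ValueError below is raised; no generate()
# call is needed to hit it.
# trust_remote_code=True is assumed for Phi-3-Vision's custom code on the Hub.
llm = LLM(
    model="microsoft/Phi-3-vision-128k-instruct",
    trust_remote_code=True,
)
```

Starting the OpenAI-compatible API server against the same model fails at exactly this point: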
```
rank0: Traceback (most recent call last):
rank0:   File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
rank0:     return _run_code(code, main_globals, None,
rank0:   File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
rank0:     exec(code, run_globals)
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 196, in <module>
rank0:     engine = AsyncLLMEngine.from_engine_args(
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 398, in from_engine_args
rank0:     engine = cls(
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 349, in __init__
rank0:     self.engine = self._init_engine(*args, **kwargs)
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 473, in _init_engine
rank0:     return engine_class(*args, **kwargs)
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 223, in __init__
rank0:     self.model_executor = executor_class(
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/executor_base.py", line 41, in __init__
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/gpu_executor.py", line 24, in _init_executor
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 122, in load_model
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 148, in load_model
rank0:     self.model = get_model(
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/__init__.py", line 21, in get_model
rank0:     return loader.load_model(model_config=model_config,
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 261, in load_model
rank0:     model = _initialize_model(model_config, self.load_config,
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 95, in _initialize_model
rank0:     model_class = get_model_architecture(model_config)
rank0:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/utils.py", line 35, in get_model_architecture
rank0:     raise ValueError(
rank0: ValueError: Model architectures ['Phi3VForCausalLM'] are not supported for now. Supported architectures: ['AquilaModel', 'AquilaForCausalLM', 'BaiChuanForCausalLM', 'BaichuanForCausalLM', 'BloomForCausalLM', 'ChatGLMModel', 'ChatGLMForConditionalGeneration', 'CohereForCausalLM', 'DbrxForCausalLM', 'DeciLMForCausalLM', 'DeepseekForCausalLM', 'FalconForCausalLM', 'GemmaForCausalLM', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTJForCausalLM', 'GPTNeoXForCausalLM', 'InternLMForCausalLM', 'InternLM2ForCausalLM', 'JAISLMHeadModel', 'LlamaForCausalLM', 'LlavaForConditionalGeneration', 'LlavaNextForConditionalGeneration', 'LLaMAForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'MptForCausalLM', 'MPTForCausalLM', 'MiniCPMForCausalLM', 'OlmoForCausalLM', 'OPTForCausalLM', 'OrionForCausalLM', 'PhiForCausalLM', 'Phi3ForCausalLM', 'QWenLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'RWForCausalLM', 'StableLMEpochForCausalLM', 'StableLmForCausalLM', 'Starcoder2ForCausalLM', 'ArcticForCausalLM', 'XverseForCausalLM', 'Phi3SmallForCausalLM', 'MistralModel']
```
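Note that `Phi3VForCausalLM` is indeed absent from the supported list in the error. For anyone wanting to check their own build, the registry can be queried directly; a sketch, assuming the `ModelRegistry` location from the 0.4.x source tree (the import path may differ in other versions):

```python
# Ask this vLLM build which architectures it registers.
from vllm.model_executor.models import ModelRegistry

# Given the error above, this should print False on my install.
print("Phi3VForCausalLM" in ModelRegistry.get_supported_archs())
```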
Your hardware and system info
Windows 10 + WSL2 (Ubuntu 22.04), CUDA 12.2.