Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
[Question]: Running Qwen2-VL-7B-Instruct with vLLM on a Tesla T4 card fails: Qwen2VLForConditionalGeneration is not supported #223
Closed
michaelwithu closed 1 month ago
Has this been raised before?
Description
Environment: transformers 4.45.0.dev0, vllm 0.6.0
Command used: python -m vllm.entrypoints.openai.api_server --model /root/.cache/modelscope/hub/qwen/Qwen2-VL-7B-Instruct/ --dtype=float32
--dtype=half was also tried.
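For context, the goal is to query the resulting OpenAI-compatible endpoint once the server is up, roughly as in the sketch below; the port (vLLM's default 8000) and the image URL are placeholders, not part of the failing setup:

```python
# Sketch of a client call against vLLM's OpenAI-compatible chat endpoint.
# Assumes the api_server eventually starts on the default port 8000 and
# serves the model under the same name passed to --model (vLLM's default).
import requests

payload = {
    "model": "/root/.cache/modelscope/hub/qwen/Qwen2-VL-7B-Instruct/",
    "messages": [
        {
            "role": "user",
            "content": [
                # Placeholder image URL for illustration only.
                {"type": "image_url", "image_url": {"url": "https://example.com/demo.jpg"}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
}

resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload, timeout=120)
print(resp.json()["choices"][0]["message"]["content"])
```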
In addition, config.json was modified as follows: "rope_scaling": { "type": "yarn", "factor": 4.0, "original_max_position_embeddings": 32768 }. Without this change, the following error occurs:
Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}
Traceback (most recent call last): ... line 1738, in _get_and_verify_max_len
assert "factor" in rope_scaling
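Concretely, the config.json edit amounts to the following (a sketch of the change described above; the path is the ModelScope cache path from the launch command):

```python
# Apply the rope_scaling workaround described above to the model's config.json.
import json

cfg_path = "/root/.cache/modelscope/hub/qwen/Qwen2-VL-7B-Instruct/config.json"

with open(cfg_path) as f:
    cfg = json.load(f)

# Replace the default mrope rope_scaling block with a yarn-style one that
# carries the "factor" key vllm 0.6.0 asserts on in _get_and_verify_max_len.
cfg["rope_scaling"] = {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)
```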
Error output:
raise ValueError( ValueError: Model architectures ['Qwen2VLForConditionalGeneration'] are not supported for now. Supported architectures: ['AquilaModel', 'AquilaForCausalLM', 'BaiChuanForCausalLM', 'BaichuanForCausalLM', 'BloomForCausalLM', 'ChatGLMModel', 'ChatGLMForConditionalGeneration', 'CohereForCausalLM', 'DbrxForCausalLM', 'DeciLMForCausalLM', 'DeepseekForCausalLM', 'DeepseekV2ForCausalLM', 'ExaoneForCausalLM', 'FalconForCausalLM', 'GemmaForCausalLM', 'Gemma2ForCausalLM', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTJForCausalLM', 'GPTNeoXForCausalLM', 'InternLMForCausalLM', 'InternLM2ForCausalLM', 'JAISLMHeadModel', 'LlamaForCausalLM', 'LLaMAForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'MptForCausalLM', 'MPTForCausalLM', 'MiniCPMForCausalLM', 'NemotronForCausalLM', 'OlmoForCausalLM', 'OPTForCausalLM', 'OrionForCausalLM', 'PersimmonForCausalLM', 'PhiForCausalLM', 'Phi3ForCausalLM', 'PhiMoEForCausalLM', 'QWenLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'RWForCausalLM', 'StableLMEpochForCausalLM', 'StableLmForCausalLM', 'Starcoder2ForCausalLM', 'ArcticForCausalLM', 'XverseForCausalLM', 'Phi3SmallForCausalLM', 'MedusaModel', 'EAGLEModel', 'MLPSpeculatorPreTrainedModel', 'JambaForCausalLM', 'GraniteForCausalLM', 'MistralModel', 'Blip2ForConditionalGeneration', 'ChameleonForConditionalGeneration', 'FuyuForCausalLM', 'InternVLChatModel', 'LlavaForConditionalGeneration', 'LlavaNextForConditionalGeneration', 'MiniCPMV', 'PaliGemmaForConditionalGeneration', 'Phi3VForCausalLM', 'UltravoxModel', 'BartModel', 'BartForConditionalGeneration']
ERROR 09-18 06:40:22 api_server.py:186] RPCServer process died before responding to readiness probe
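To confirm that this is an architecture-registration gap in the installed vLLM build rather than a T4 or dtype problem, the model registry can be inspected directly. This is only a sketch; ModelRegistry is exported by vLLM, but the helper names may differ between versions:

```python
# Check which architectures the installed vLLM build registers.
# On vllm 0.6.0, Qwen2VLForConditionalGeneration is absent from this list,
# which is exactly what the ValueError above reports.
from vllm import ModelRegistry

archs = ModelRegistry.get_supported_archs()
print("Qwen2VLForConditionalGeneration" in archs)  # False on vllm 0.6.0
print(sorted(archs))
```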