QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Apache License 2.0

Which transformers version could be used with vLLM 0.6.2? #282

Open bash99 opened 2 days ago

bash99 commented 2 days ago

vLLM 0.6.2 was released just a few hours ago, and the release notes say it now supports multi-image inference with Qwen2-VL.

I tried it, but it requires the newest transformers and installs it automatically.

When I start it with the following command (which worked with vLLM 0.6.1):

VLLM_WORKER_MULTIPROC_METHOD=spawn CUDA_VISIBLE_DEVICES=0,1 python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-VL-72B-Instruct-GPTQ-Int4 --model Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4 --port 7869 --dtype half --trust-remote-code --kv-cache-dtype fp8 -q gptq --disable-log-requests --gpu-memory-utilization 0.998 --max-model-len 24576 --max_num_seqs 16 -tp 2

it reports an error like:

  File "/DaTa/.local/home/hai.li/miniforge3/envs/vllm/lib/python3.12/site-packages/vllm/config.py", line 1746, in _get_and_verify_max_len
    assert "factor" in rope_scaling
           ^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}

If I revert to the older transformers commit with

pip install git+https://github.com/huggingface/transformers@21fac7abba2a37fae86106f87fcf9974fd1e3830

it reports an error like:

  File "/DaTa/.local/home/hai.li/miniforge3/envs/vllm/lib/python3.12/site-packages/vllm/transformers_utils/configs/mll
ama.py", line 1, in <module>
    from transformers.models.mllama import configuration_mllama as mllama_hf_config
ModuleNotFoundError: No module named 'transformers.models.mllama'
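
For context, both errors point to a transformers version mismatch. The quick check below is a sketch (not part of the original report): it prints the installed transformers version and whether it ships transformers.models.mllama, the module that vLLM 0.6.2's config utilities import in the traceback above; the older pinned commit predates that module.

# Sanity check of the installed transformers build (a sketch; the module
# name is the one from the ModuleNotFoundError above).
import importlib.util

import transformers

print("transformers version:", transformers.__version__)

# vLLM 0.6.2's config loader imports transformers.models.mllama; older
# commits, such as the one pinned above, do not ship this module yet.
has_mllama = importlib.util.find_spec("transformers.models.mllama") is not None
print("has transformers.models.mllama:", has_mllama)
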
xiehust commented 2 days ago

same issue +1

mkaskov commented 2 days ago

the same

xiehust commented 2 days ago

Seems the cause has been found: https://github.com/vllm-project/vllm/pull/8829

DarkLight1337 commented 2 days ago

We have just now fixed the issue in https://github.com/vllm-project/vllm/pull/8837. Please install vLLM from source to resolve the config loading problem.

verigle commented 1 day ago

We have just now fixed the issue in vllm-project/vllm#8837. Please install vLLM from source to resolve the config loading problem.

vLLM still does not support Q&A with multiple images or videos; is there a plan to fix this?

DarkLight1337 commented 1 day ago

We have just now fixed the issue in vllm-project/vllm#8837. Please install vLLM from source to resolve the config loading problem.

vLLM still does not support Q&A with multiple images or videos; is there a plan to fix this?

Multi-image input is currently supported in both offline and online inference, while video input is only supported for offline inference at the moment. If you need to pass videos via OpenAI API, you can instead provide multiple images for now. Please check the example in examples/openai_vision_api_client.py (especially the part labelled "Multi-image input inference")
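
As a reference, here is a minimal multi-image request against the OpenAI-compatible server started above. It is a sketch following the pattern of examples/openai_vision_api_client.py; the port and served model name are taken from the launch command earlier in this thread, and the image URLs are placeholders.

# Minimal multi-image request to the OpenAI-compatible server (a sketch;
# port and model name follow the launch command above, image URLs are
# placeholders).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:7869/v1",
    api_key="EMPTY",  # vLLM ignores the key unless the server requires one
)

response = client.chat.completions.create(
    model="Qwen2-VL-72B-Instruct-GPTQ-Int4",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what these two images have in common."},
                {"type": "image_url", "image_url": {"url": "https://example.com/image1.jpg"}},
                {"type": "image_url", "image_url": {"url": "https://example.com/image2.jpg"}},
            ],
        }
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
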

jbohnslav commented 22 hours ago

We have just now fixed the issue in https://github.com/vllm-project/vllm/pull/8837. Please install vLLM from source to resolve the config loading problem.

Can we get a .post0 release for this? Installing from source is a lot more difficult.