Open Cherryjingyao opened 1 week ago
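Sorry, video input is not currently supported. We recommend extracting frames from the video and passing them as multiple image inputs. To keep the sequence from getting too long, we suggest sampling frames at FPS=2.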
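For reference, the frame-sampling suggestion above could be sketched roughly as follows. This is only an illustration, not the maintainers' code: the helper names `sample_frame_indices` and `extract_jpeg_frames` are made up for this example, and the second helper assumes the third-party `opencv-python` package is installed.

```python
def sample_frame_indices(total_frames: int, native_fps: float,
                         target_fps: float = 2.0) -> list[int]:
    """Pick frame indices so that roughly `target_fps` frames per second remain."""
    if total_frames <= 0 or native_fps <= 0 or target_fps <= 0:
        return []
    # Never step by less than one frame, even if target_fps exceeds native_fps.
    step = max(native_fps / target_fps, 1.0)
    indices = []
    position = 0.0
    while int(position) < total_frames:
        indices.append(int(position))
        position += step
    return indices


def extract_jpeg_frames(video_path: str, target_fps: float = 2.0) -> list[bytes]:
    """Decode the sampled frames and return them as JPEG-encoded bytes."""
    import cv2  # assumption: opencv-python is available; imported lazily

    cap = cv2.VideoCapture(video_path)
    try:
        native_fps = cap.get(cv2.CAP_PROP_FPS)
        total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        frames = []
        for idx in sample_frame_indices(total, native_fps, target_fps):
            cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # seek to the sampled frame
            ok, frame = cap.read()
            if not ok:
                break
            ok, buf = cv2.imencode(".jpg", frame)
            if ok:
                frames.append(buf.tobytes())
        return frames
    finally:
        cap.release()
```

With a 30 fps clip, this keeps every 15th frame. Lowering `target_fps` further shortens the sequence for long videos.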
A follow-up question: I am using vLLM as the server, with a request like this:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen2-VL-7B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"}},
        {"type": "text", "text": "What is the text in the illustrate?"}
      ]}
    ]
  }'
How should I pass multiple images to represent a video? It seems that only the image_url field is supported, not the image or video fields.
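One way to send several frames through the image_url field is to put one image_url part per frame in the same user message. The sketch below only builds and posts such a request. The helper names are hypothetical, and it assumes the server accepts base64 `data:` URLs in image_url and allows more than one image per request; both depend on the deployment.

```python
import base64
import json
import urllib.request


def build_multi_image_payload(model: str, jpeg_frames: list[bytes], question: str,
                              system_prompt: str = "You are a helpful assistant.") -> dict:
    """Build a chat-completions body with one image_url part per frame."""
    content = []
    for frame in jpeg_frames:
        b64 = base64.b64encode(frame).decode("ascii")
        content.append({
            "type": "image_url",
            # assumption: the server accepts inline base64 data URLs
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
        })
    # The question goes last, after all frames, as in the curl example.
    content.append({"type": "text", "text": question})
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": content},
        ],
    }


def post_chat_completion(payload: dict,
                         url: str = "http://localhost:8000/v1/chat/completions") -> str:
    """POST the payload as JSON and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Keeping the frames in chronological order and placing the text part last mirrors the single-image request, so the question applies to the whole frame sequence.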
Just change it to "type": "video", "video": "xxx".
https://github.com/QwenLM/Qwen2-VL?tab=readme-ov-file#deployment
That is exactly how I deployed it, but when I tested it I got this error:
{"object":"error","message":"Unknown part type: video","type":"BadRequestError","param":null,"code":400}
I found that vLLM's entrypoint has no video part type at all:
https://github.com/fyabc/vllm/blob/add_qwen2_vl_new/vllm/entrypoints/chat_utils.py
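A client could detect this rejection and fall back to sending frames as images. The sketch below parses an error body shaped like the one quoted above. The function name is hypothetical, and the "Unknown part type: " prefix is taken only from that single message, so it may differ across server versions.

```python
import json
from typing import Optional


def unsupported_part_type(response_body: str) -> Optional[str]:
    """Return the rejected part type if the body is an 'Unknown part type' error."""
    try:
        data = json.loads(response_body)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or data.get("object") != "error":
        return None
    message = str(data.get("message", ""))
    prefix = "Unknown part type: "  # assumption: wording as in the quoted error
    if message.startswith(prefix):
        return message[len(prefix):]
    return None
```

If the function returns "video", the client can re-send the request with image_url parts instead.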
Traceback (most recent call last):
  File "/pfs-data/code/Success_VQA/test_demo/Qwen2-VL/agent_demo.py", line 99, in <module>
    for response in bot.run(messages=messages):
  File "/data/anaconda3/envs/llava/lib/python3.10/site-packages/qwen_agent/agent.py", line 83, in run
    new_messages.append(Message(**msg))
  File "/data/anaconda3/envs/llava/lib/python3.10/site-packages/qwen_agent/llm/schema.py", line 114, in __init__
    super().__init__(role=role, content=content, name=name, function_call=function_call)
  File "/data/anaconda3/envs/llava/lib/python3.10/site-packages/pydantic/main.py", line 193, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
TypeError: ContentItem.__init__() got an unexpected keyword argument 'video'
Exception ignored in: <function CodeInterpreter.__del__ at 0x7ff27a763490>
Traceback (most recent call last):
  File "/data/anaconda3/envs/llava/lib/python3.10/site-packages/qwen_agent/tools/code_interpreter.py", line 124, in __del__
TypeError: 'NoneType' object is not callable
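When I call the agent with the image replaced by a video, I get the error above. Is video input supported, and if so, what does an example call look like?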