Closed younger-diao closed 4 weeks ago
Video inference requires passing extra parameters, which upstream vLLM does not support yet; we suggest using our fork:

```shell
git clone https://github.com/OpenBMB/vllm
cd vllm
git checkout minicpmv
pip install -e .
```
Also, could you provide the outer-level code that calls vLLM for inference?
```python
from transformers import AutoTokenizer
from decord import VideoReader, cpu
from PIL import Image
from vllm import LLM, SamplingParams
import time

MAX_NUM_FRAMES = 32

def encode_video(filepath):
    def uniform_sample(l, n):
        gap = len(l) / n
        idxs = [int(i * gap + gap / 2) for i in range(n)]
        return [l[i] for i in idxs]

    vr = VideoReader(filepath, ctx=cpu(0))
    sample_fps = round(vr.get_avg_fps() / 1)  # sample one frame per second
    frame_idx = [i for i in range(0, len(vr), sample_fps)]
    if len(frame_idx) > MAX_NUM_FRAMES:
        frame_idx = uniform_sample(frame_idx, MAX_NUM_FRAMES)
    video = vr.get_batch(frame_idx).asnumpy()
    video = [Image.fromarray(v.astype('uint8')) for v in video]
    return video

MODEL_NAME = "/data/models/MiniCPM-V-2_6"  # or local model path
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
llm = LLM(
    model=MODEL_NAME,
    gpu_memory_utilization=1,
    trust_remote_code=True,
    max_model_len=4096
)

start = time.time()
video = encode_video("/data/tokyo_people.mp4")
messages = [{
    "role": "user",
    # one image placeholder per sampled frame, then the question;
    # the question text is truncated in the original post
    "content": "".join(["(<image>./</image>)"] * len(video)) + "..."
}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

stop_tokens = ['<|im_end|>', '<|endoftext|>']
stop_token_ids = [tokenizer.convert_tokens_to_ids(i) for i in stop_tokens]

sampling_params = SamplingParams(
    stop_token_ids=stop_token_ids,
    use_beam_search=True,
    temperature=0,
    # top_k=100,
    # repetition_penalty=1.05,
    max_tokens=64,
    best_of=3
)

outputs = llm.generate({
    "prompt": prompt,
    "multi_modal_data": {
        "image": {
            "images": video,
            "use_image_id": False,
            "max_slice_nums": 1 if len(video) > 16 else 2
        }
    }
}, sampling_params=sampling_params)

finish = time.time()
print('Predicted in %f seconds.' % (finish - start))
print(outputs[0].outputs[0].text)
```
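As a side note, the `uniform_sample` helper above takes the midpoint of each of `n` equal-width bins over the index list, so the kept frames are spread evenly across the video. A minimal standalone check (input values are illustrative, not from the original post):

```python
def uniform_sample(l, n):
    # pick the midpoint of each of n equal-width bins over l
    gap = len(l) / n
    idxs = [int(i * gap + gap / 2) for i in range(n)]
    return [l[i] for i in idxs]

# 100 candidate frame indices capped to 5 evenly spaced ones
print(uniform_sample(list(range(100)), 5))  # -> [10, 30, 50, 70, 90]
```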
Using video requires some extra parameters. What upstream vLLM supports is:

```python
outputs = llm.generate({
    "prompt": prompt,
    "multi_modal_data": {
        "image": image  # or [image] * len
    }
}, sampling_params=sampling_params)
```

Our video path introduces new parameters, as in your code, which vLLM does not support yet, so we maintain a fork to add that support. Please try our repo and see whether it runs for you.

If you would rather not clone and install the fork, you can instead edit the corresponding parameters ("use_image_id", "max_slice_nums") directly in config.json
and preprocessor_config.json,
and keep the call unchanged, without passing the parameters.
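If you take the config-editing route, the change amounts to setting those two keys in the model directory's JSON files. A hedged sketch (the exact file that holds each key and the values shown are illustrative; check your model's actual files):

```python
import json
from pathlib import Path

def set_video_defaults(cfg_path):
    """Rewrite a MiniCPM-V config JSON in place so the video-related
    defaults match what the per-call parameters would otherwise set.
    Keys are the ones named in the thread; values are illustrative."""
    cfg_path = Path(cfg_path)
    cfg = json.loads(cfg_path.read_text())
    cfg["use_image_id"] = False   # frames of one video share no image ids
    cfg["max_slice_nums"] = 1     # fewer slices per frame, fewer tokens
    cfg_path.write_text(json.dumps(cfg, indent=2))
    return cfg

# usage (path is an example):
# set_video_defaults("/data/models/MiniCPM-V-2_6/preprocessor_config.json")
```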
Thanks for the pointers.
Could you provide working vLLM invocation code? The code from the official Feishu doc does not run.
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
1. The example code is wrong; 2. `get_placeholder(images[i].size, i)` at line 468 of vllm/model_executor/models/minicpmv.py raises an error:

```
rank0: Traceback (most recent call last):
rank0:   File "/data/diaohf/Multi-Model/MiniCPM-V-2.6/demo_vllm_video.py", line 61, in <module>
rank0:     outputs = llm.generate({
rank0:   File "/data/diaohf/anaconda3/envs/MiniCPMV2.6/lib/python3.10/site-packages/vllm/utils.py", line 895, in inner
rank0:     return fn(*args, **kwargs)
rank0:   File "/data/diaohf/anaconda3/envs/MiniCPMV2.6/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 323, in generate
rank0:   File "/data/diaohf/anaconda3/envs/MiniCPMV2.6/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 552, in _validate_and_add_requests
rank0:   File "/data/diaohf/anaconda3/envs/MiniCPMV2.6/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 568, in _add_request
rank0:   File "/data/diaohf/anaconda3/envs/MiniCPMV2.6/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 654, in add_request
rank0:     processed_inputs = self.process_model_inputs(
rank0:   File "/data/diaohf/anaconda3/envs/MiniCPMV2.6/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 594, in process_model_inputs
rank0:     return self.input_processor(llm_inputs)
rank0:   File "/data/diaohf/anaconda3/envs/MiniCPMV2.6/lib/python3.10/site-packages/vllm/inputs/registry.py", line 202, in process_input
rank0:     return processor(InputContext(model_config), inputs)
rank0:   File "/data/diaohf/anaconda3/envs/MiniCPMV2.6/lib/python3.10/site-packages/vllm/model_executor/models/minicpmv.py", line 471, in input_processor_for_minicpmv
rank0:     get_placeholder(images[i].size, i)
rank0: KeyError: 0
```
期望行为 | Expected Behavior
Please fix the bug.
复现方法 | Steps To Reproduce
No response
运行环境 | Environment
备注 | Anything else?
No response