OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
https://internvl.readthedocs.io/en/latest/
MIT License
5.54k stars, 432 forks

[ModelScope][InternVL2-4B]NameError: name 'VideoReader' is not defined #379

Closed. zjykzj closed this issue 2 months ago

zjykzj commented 2 months ago

First, thank you very much for your open-source work!!!

I plan to use https://modelscope.cn/models/OpenGVLab/InternVL2-4B and ran into the following error when running the example program:

def load_video(video_path, bound=None, input_size=448, max_num=1, num_segments=32):
    vr = VideoReader(video_path, ctx=cpu(0), num_threads=1)
    max_frame = len(vr) - 1
    fps = float(vr.get_avg_fps())

    pixel_values_list, num_patches_list = [], []
    transform = build_transform(input_size=input_size)
    frame_indices = get_index(bound, fps, max_frame, first_idx=0, num_segments=num_segments)
    for frame_index in frame_indices:
        img = Image.fromarray(vr[frame_index].asnumpy()).convert('RGB')
        img = dynamic_preprocess(img, image_size=input_size, use_thumbnail=True, max_num=max_num)
        pixel_values = [transform(tile) for tile in img]
        pixel_values = torch.stack(pixel_values)
        num_patches_list.append(pixel_values.shape[0])
        pixel_values_list.append(pixel_values)
    pixel_values = torch.cat(pixel_values_list)
    return pixel_values, num_patches_list
...
...
Traceback (most recent call last):
  File "/data/zj/llm/LLMSamples/internval-4b.py", line 228, in <module>
    pixel_values, num_patches_list = load_video(video_path, num_segments=8, max_num=2)
  File "/data/zj/llm/LLMSamples/internval-4b.py", line 208, in load_video
    vr = VideoReader(video_path, ctx=cpu(0), num_threads=1)
NameError: name 'VideoReader' is not defined

Besides VideoReader, cpu is also used in the example without being imported.


hw446 commented 2 months ago

pip install decord

from decord import VideoReader, cpu

zjykzj commented 2 months ago

@hw446 @lvhan028 @shepnerd @whai362 The example program comes from the Quick Start section of https://modelscope.cn/models/OpenGVLab/InternVL2-4B. Perhaps the following import statements could be added to it:

import numpy as np
from decord import VideoReader, cpu
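
Until the example is updated, a small guard at the top of the script can report a missing package before the NameError appears. This is only a sketch: the helper name missing_packages and the REQUIRED mapping are hypothetical, and it assumes the import name and the pip package name are the same for both numpy and decord.

```python
import importlib.util

# Hypothetical mapping: import name -> pip package name.
# Assumes the video part of the example needs only these two extra packages.
REQUIRED = {"numpy": "numpy", "decord": "decord"}


def missing_packages(required=REQUIRED):
    """Return the pip names of top-level packages that cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    # Report what to install instead of failing later with a NameError.
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    # These are the imports suggested in this thread.
    import numpy as np
    from decord import VideoReader, cpu
    print("Video dependencies are available.")
```

With both packages installed, the script falls through to the imports, and load_video can then resolve VideoReader and cpu.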