boheumd / MA-LMM

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
https://boheumd.github.io/MA-LMM/
MIT License

MSRVTT data processing: what does 10 fps mean? #3

Closed. dbcSep03 closed this issue 7 months ago

dbcSep03 commented 7 months ago

Could you share the processed datasets? The dataset I processed myself has many problems. Thank you!

dbcSep03 commented 7 months ago

For video 465, I get 259 frames. My processing code is:

import cv2
import os

def extract_frames(video_path, output_dir, fps=10):
    # Ensure output directory exists
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    # Capture video from file
    cap = cv2.VideoCapture(video_path)
    video_fps = cap.get(cv2.CAP_PROP_FPS)  # Get the original video's FPS

    frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    print(f"Total frames: {frame_count}, Video FPS: {video_fps}")

    # If video fps is lower than the target fps, extract every frame
    if video_fps <= fps:
        frame_interval = 1
    else:
        frame_interval = int(video_fps / fps)

    count = 0
    frame_id = 0

    while True:
        success, frame = cap.read()
        if not success:
            break

        # Check if this frame needs to be saved
        if count % frame_interval == 0:
            frame_filename = f"frame{frame_id:06d}.jpg"
            frame_path = os.path.join(output_dir, frame_filename)
            cv2.imwrite(frame_path, frame)
            frame_id += 1
            print(f"Saved {frame_path}")

        count += 1

    cap.release()
    print("Done extracting frames.")

if __name__ == "__main__":
    skipped = 0
    video_dir = '/home/dongbingcheng/MA-LMM/data/msrvtt/videos'
    output_dir = "/home/dongbingcheng/MA-LMM/data/msrvtt/frames"
    # Video IDs whose frames already exist (listed from a different root than output_dir)
    existed = set(os.listdir("/share/dongbingcheng/msrvtt/frames"))
    for name in os.listdir(video_dir):
        video_id = name.split(".")[0]
        if video_id in existed:
            # Skip videos that were already processed
            skipped += 1
            print(skipped)
            print(video_id)
            continue
        extract_frames(f'{video_dir}/{name}', f'{output_dir}/{video_id}')
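
As a rough check on the interval logic in the script above: because frame_interval is truncated to an integer, the effective sampling rate equals the target only when the source frame rate is an exact multiple of it. The sketch below works through that arithmetic. The helper names and the source frame rates (30, 29.97, 25) are assumptions chosen for illustration, not values measured on the MSRVTT videos.

```python
def frame_interval(video_fps: float, target_fps: float = 10) -> int:
    """Interval rule used by the script above: keep every N-th decoded frame."""
    if video_fps <= target_fps:
        return 1
    return int(video_fps / target_fps)  # truncates, e.g. 2.5 -> 2


def effective_fps(video_fps: float, target_fps: float = 10) -> float:
    """Sampling rate actually produced by keeping every N-th frame."""
    return video_fps / frame_interval(video_fps, target_fps)


def saved_frames(total_frames: int, video_fps: float, target_fps: float = 10) -> int:
    """Frames the loop saves: indices 0, N, 2N, ... below total_frames."""
    n = frame_interval(video_fps, target_fps)
    return (total_frames + n - 1) // n  # ceiling division


if __name__ == "__main__":
    # Hypothetical source frame rates, for illustration only
    for src in (30.0, 29.97, 25.0):
        print(f"source {src} fps -> effective {effective_fps(src):.3f} fps")
```

Under these assumptions, a 10-second clip at 25 fps (250 decoded frames) yields 125 saved frames, while resampling the same clip to 10 fps would yield roughly 100. This is one way two pipelines can disagree on the frame count for the same video.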
boheumd commented 7 months ago

Hello! I reviewed my preprocessing code for extracting video frames and have pasted it below for your reference. Note that the number of frames extracted from a video can vary depending on the library used for preprocessing. Alternatively, you can update the "frame_length" field for each video in the annotation file to match the number of frames you actually extracted.

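import os

# src_base_dir, dst_base_dir, and fps are defined elsewhere in the preprocessing script.
def extract_frames(video_name):
    video_id = video_name.split('.')[0]
    os.makedirs("{}/{}".format(dst_base_dir, video_id), exist_ok=True)
    # Scale to height 256, resample to `fps` frames per second, and write JPEG frames
    cmd = 'ffmpeg -i \"{}/{}\" -vf scale=-1:256 -pix_fmt yuvj422p -q:v 1 -r {} -y \"{}/{}/frame%06d.jpg\"'.format(
        src_base_dir, video_name, fps, dst_base_dir, video_id)
    os.system(cmd)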
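
The alternative mentioned above, updating "frame_length" to the number of frames actually extracted, could be sketched roughly as follows. This is only an illustration: the annotation layout (a JSON list of records whose "video" key holds the video ID, matching a subdirectory of the frames root) and all helper names are assumptions, so the keys would need adapting to the real annotation file.

```python
import json
import os


def count_frames(frame_dir):
    """Count the extracted JPEG frames in one video's directory."""
    return sum(1 for f in os.listdir(frame_dir) if f.endswith(".jpg"))


def update_frame_length(annotations, frames_root, id_key="video"):
    """Set "frame_length" on each record to the number of frames on disk.

    Assumes `annotations` is a list of dicts and that record[id_key] names
    a subdirectory of `frames_root` (hypothetical layout).
    Records without a matching directory are left unchanged.
    """
    for record in annotations:
        frame_dir = os.path.join(frames_root, str(record[id_key]))
        if os.path.isdir(frame_dir):
            record["frame_length"] = count_frames(frame_dir)
    return annotations


def update_annotation_file(annotation_path, frames_root, id_key="video"):
    """Load a JSON annotation file, update it, and write it back."""
    with open(annotation_path) as f:
        annotations = json.load(f)
    update_frame_length(annotations, frames_root, id_key)
    with open(annotation_path, "w") as f:
        json.dump(annotations, f)
```

Since update_annotation_file overwrites its input, it is safer to keep a copy of the original annotation file before running it.
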
dbcSep03 commented 7 months ago

Thanks for your answer!