GeWu-Lab / OGM-GE_CVPR2022

The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (ORAL)
MIT License

Preprocessing Datasets #9

Open AvivSham opened 2 years ago

AvivSham commented 2 years ago

Hi all, thank you for this wonderful repo. I have a few questions regarding the preprocessing procedure.

Specifically, I'm trying to preprocess the AVE dataset. The raw dataset contains only videos, and the directory tree looks like this:

├── AVE
├── Annotations.txt
├── ReadMe.txt
├── testSet.txt
├── trainSet.txt
└── valSet.txt

where AVE contains all the mp4 files:

├── AVE
│   ├── ---1_cCGK4M.mp4
│   ├── --12UOziMF0.mp4
│   ├── --5zANFBYzQ.mp4
│   ├── --9O4XZOge4.mp4

Next, I tried to run obtain_frames.py but got errors like the following:

process path(<PATH TO AVE>/AVE/utIXg-Dp2Yg.mp4) is not dir

I assumed the script expects each mp4 file to be inside a folder of its own, so I reorganized the dataset with the following script:

import shutil
from tqdm import tqdm
from pathlib import Path

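if __name__ == '__main__':
    ave_path = Path("<PATH TO AVE>/AVE")
    # List the files up front so tqdm can show a total and the moves
    # below do not interfere with the directory listing.
    for p in tqdm(sorted(ave_path.glob("*.mp4"))):
        # Move <name>.mp4 into a folder called <name> next to it.
        new_dir = p.parent / p.stem
        new_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(p), str(new_dir / p.name))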

But re-running obtain_frames.py still did not work. How do you expect the dataset to be organized?
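If the per-video folders turn out not to be what the script expects, the move above can be reversed. The following is a minimal sketch only: `flatten_video_dirs` is a hypothetical helper, not part of the repo, and it assumes every folder holds a single file named after the folder, which is what the script above produces.

import shutil
from pathlib import Path


def flatten_video_dirs(root, ext=".mp4"):
    """Move <root>/<name>/<name><ext> back to <root>/<name><ext>.

    Hypothetical helper; returns the names of the files that were moved.
    """
    root = Path(root)
    moved = []
    # sorted() materializes the listing before anything is moved.
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        video = sub / (sub.name + ext)
        target = root / video.name
        # Skip folders without the expected file and never overwrite.
        if video.is_file() and not target.exists():
            shutil.move(str(video), str(target))
            moved.append(video.name)
            # Remove the folder only if nothing else is left in it.
            if not any(sub.iterdir()):
                sub.rmdir()
    return moved

Calling `flatten_video_dirs("<PATH TO AVE>/AVE")` would then restore the flat layout shown at the top of this comment.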

In addition, the next step is to extract the spectrograms. From a quick look at obtain_audio_spectrogram.py, it assumes we already have a folder of wav/mp3 files. As mentioned above, that is not the case for the AVE dataset. How should we handle this? Do we need to extract the audio with a snippet like the following?

from moviepy.editor import VideoFileClip

# Load the video and write its audio track to a separate file.
clip = VideoFileClip("<PATH TO CLIP>")
clip.audio.write_audiofile("<PATH FOR AUDIO FILE>")

Is there any additional processing required?

CREMAD

The CREMAD dataset contains flv files. Do we need to convert them to another format?

General note: it would be great if you could rewrite the usage section so that it contains per-dataset instructions. That would help users and reduce the number of issues opened.

Thank you in advance, A. @avivnavon

xiaokangpeng commented 2 years ago

Hello @AvivSham, thanks for your detailed questions and advice! We reviewed our preprocessing code and the instructions in the README and found that they did not match in some places. We have therefore updated the preprocessing files and rewritten the preprocessing section of the README.

To answer your questions: the directory tree is meant to show the split files needed for the experiments, and the AVE folder contains all the mp4 files, as you said. We have provided new code for processing AVE, so please try again. As for the flv files, if you use VideoFileClip, you can process them in the same way as mp4 files.
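For readers who want to try the VideoFileClip route on both datasets, here is a rough sketch, not the repo's own preprocessing code. The helper names (`plan_audio_jobs`, `extract_audio`), the flat output folder of wav files, and the extension set are all assumptions made for illustration. The import falls back to `from moviepy import VideoFileClip` because moviepy 2.x no longer ships `moviepy.editor`.

from pathlib import Path

# Assumption: AVE ships .mp4 files and CREMAD ships .flv files.
VIDEO_EXTS = {".mp4", ".flv"}


def plan_audio_jobs(video_dir, audio_dir, exts=VIDEO_EXTS):
    """Return sorted (video_path, wav_path) pairs for videos under video_dir."""
    video_dir, audio_dir = Path(video_dir), Path(audio_dir)
    jobs = []
    for video in sorted(video_dir.rglob("*")):
        if video.is_file() and video.suffix.lower() in exts:
            jobs.append((video, audio_dir / (video.stem + ".wav")))
    return jobs


def extract_audio(video_path, wav_path):
    """Write the audio track of one video to wav_path (hypothetical helper)."""
    # Imported lazily so that planning works without moviepy installed.
    try:
        from moviepy.editor import VideoFileClip  # moviepy 1.x
    except ImportError:
        from moviepy import VideoFileClip  # moviepy 2.x
    wav_path.parent.mkdir(parents=True, exist_ok=True)
    clip = VideoFileClip(str(video_path))
    try:
        # Some clips may have no audio track; those are skipped.
        if clip.audio is not None:
            clip.audio.write_audiofile(str(wav_path))
    finally:
        clip.close()


if __name__ == "__main__":
    for video, wav in plan_audio_jobs("<PATH TO AVE>/AVE", "<PATH FOR AUDIO FILES>"):
        extract_audio(video, wav)

Whether obtain_audio_spectrogram.py expects this exact folder layout or a particular sample rate is not stated in this thread, so check the updated README before relying on the output.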