boheumd / MA-LMM

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
https://boheumd.github.io/MA-LMM/
MIT License
221 stars 26 forks

dataset processing #2

Closed haoranD closed 3 months ago

haoranD commented 5 months ago

Hi,

Thanks for your amazing work.

I can only find the download script for MSRVTT and MSVD in the lavis folder.

Could you share more details on how to download, organize, and preprocess the other datasets?

Thanks a lot.

Best

boheumd commented 5 months ago

Hello! Currently, the download script only supports the MSRVTT and MSVD datasets. For the other datasets, please refer to the provided links and download the videos from each dataset's official download page. Afterward, preprocess the videos by extracting frames at 10 frames per second (fps=10) to prepare the dataset for use. I plan to add support for more datasets to the download script in the future.
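
For readers who want a starting point, the frame-extraction step described above could be sketched roughly as follows. This is not the repository's own preprocessing code: it assumes `ffmpeg` is installed and on the PATH, and the helper names (`build_ffmpeg_command`, `extract_frames`, `extract_all`), the one-folder-per-video layout, the `frame%06d.jpg` naming, and the JPEG quality setting are all illustrative assumptions. Check the dataset loaders in the repo for the folder structure and file names they actually expect.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, frames_dir, fps=10):
    """Return the ffmpeg argument list that samples `video_path` at `fps`."""
    # Assumed naming scheme: frame000001.jpg, frame000002.jpg, ...
    output_pattern = str(Path(frames_dir) / "frame%06d.jpg")
    return [
        "ffmpeg",
        "-i", str(video_path),
        "-vf", f"fps={fps}",  # resample the video to a fixed frame rate
        "-q:v", "2",          # JPEG quality (assumed setting, not from the repo)
        output_pattern,
    ]


def extract_frames(video_path, output_root, fps=10):
    """Extract frames from one video into output_root/<video file stem>/."""
    video_path = Path(video_path)
    frames_dir = Path(output_root) / video_path.stem
    frames_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_ffmpeg_command(video_path, frames_dir, fps), check=True)
    return frames_dir


def extract_all(video_root, output_root, fps=10, pattern="*.mp4"):
    """Run extract_frames on every video in video_root that matches `pattern`."""
    for video_path in sorted(Path(video_root).glob(pattern)):
        extract_frames(video_path, output_root, fps)
```

With hypothetical paths, calling `extract_all("data/videos", "data/frames")` would write the frames of `data/videos/clip1.mp4` to `data/frames/clip1/`. Keeping the command construction in its own function makes it easy to inspect or adjust the ffmpeg arguments before running anything.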