boheumd / MA-LMM

(CVPR 2024) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
https://boheumd.github.io/MA-LMM/
MIT License
178 stars, 21 forks

LVU, COIN dataset #21

Closed by nisargshah1999 2 weeks ago

nisargshah1999 commented 3 weeks ago

Hi, thanks for the interesting work. Would it be possible to share the LVU and COIN datasets? Also, is there any code to preprocess them? I couldn't find any information in lavis/datasets/download_videoes.

Thanks

boheumd commented 3 weeks ago

Hi, you can download the LVU and COIN datasets through the links provided in the README file. For data preprocessing, simply extracting video frames at fps=10 is enough.

boheumd commented 3 weeks ago

Hi, I updated the README.md. The example preprocessing code is provided here: https://github.com/boheumd/MA-LMM/blob/main/data/extract_frames.py
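
For readers who want a quick picture of what "extracting video frames at fps=10" could look like, here is a minimal sketch. It is not taken from the linked extract_frames.py and may differ from it. It assumes the ffmpeg command-line tool is installed and on the PATH. The function names, the `frame%06d.jpg` naming pattern, and the JPEG quality setting are hypothetical choices made for illustration only.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, output_dir, fps=10):
    """Return an ffmpeg argument list that samples a video at `fps` frames per second.

    The output file pattern (frame000001.jpg, frame000002.jpg, ...) is an
    assumption for this sketch, not something specified in the thread.
    """
    output_pattern = str(Path(output_dir) / "frame%06d.jpg")
    return [
        "ffmpeg",
        "-i", str(video_path),   # input video
        "-vf", f"fps={fps}",     # resample to the requested frame rate
        "-q:v", "2",             # JPEG quality (assumed setting, lower is better)
        output_pattern,
    ]


def extract_frames(video_path, output_dir, fps=10):
    """Create `output_dir` if needed and run ffmpeg to write the sampled frames."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video_path, output_dir, fps), check=True)
```

A call such as `extract_frames("videos/example.mp4", "frames/example")` (both paths hypothetical) would write one JPEG per sampled frame into `frames/example`. Building the command in a separate function makes it easy to inspect the arguments before running ffmpeg on a full dataset.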