facebookresearch / LaViLa

Code release for "Learning Video Representations from Large Language Models"
MIT License

About Ego4d dataset #37

Closed: lwpyh closed this issue 3 months ago

lwpyh commented 3 months ago

Hi there,

Thank you for your excellent work! I would like to ask whether you trained on the full Ego4D dataset (raw videos) or on features extracted from it. Additionally, is there any way to avoid downloading such a large dataset?

Best

zhaoyue-zephyrus commented 3 months ago

Hi @lwpyh,

We use the full Ego4D dataset (videos) for training.

Ego4D has its own license and we suggest following the steps in https://ego4d-data.org/#download to sign the license form and obtain the data. Once the raw videos are downloaded, you may follow the pre-processing steps in https://github.com/facebookresearch/LaViLa/tree/main/datasets#ego4d. The videos after pre-processing are significantly smaller.
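As a rough illustration of why pre-processed videos end up much smaller, here is a minimal sketch of a downscale-and-chunk step. It is not the repository's actual script: the function name `build_ffmpeg_cmd`, the 288-pixel short side, and the 300-second chunk length are assumptions chosen for illustration, so please check the linked README for the real settings. The sketch only builds an `ffmpeg` command line and does not run it.

```python
from pathlib import Path


def build_ffmpeg_cmd(src, out_dir, short_side=288, chunk_seconds=300):
    """Build an ffmpeg command that downscales a video and splits it into chunks.

    NOTE: short_side and chunk_seconds are illustrative assumptions, not
    values confirmed in this thread.
    """
    src = Path(src)
    # Output files are numbered per chunk, e.g. <name>_0000.mp4, <name>_0001.mp4
    out_pattern = Path(out_dir) / f"{src.stem}_%04d.mp4"
    # Resize so the shorter side equals short_side; -2 lets ffmpeg pick the
    # other side to keep the aspect ratio, rounded to an even number.
    scale = (
        f"scale='if(gt(iw,ih),-2,{short_side})':"
        f"'if(gt(iw,ih),{short_side},-2)'"
    )
    return [
        "ffmpeg", "-y", "-i", str(src),
        "-vf", scale,
        # The segment muxer writes fixed-length chunks to the numbered pattern.
        "-f", "segment",
        "-segment_time", str(chunk_seconds),
        "-reset_timestamps", "1",
        str(out_pattern),
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(build_ffmpeg_cmd("raw/example_video.mp4", "processed")))
```

To actually run it, the returned list can be passed to `subprocess.run(cmd, check=True)` once `ffmpeg` is installed and the output directory exists.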

Also, for faster training, we recommend the practices discussed in our follow-up work.

Best, Yue

lwpyh commented 3 months ago

Thanks for the quick reply. I will read it carefully.

Best