facebookresearch / LaViLa

Code release for "Learning Video Representations from Large Language Models"
MIT License

Preprocessing of Ego4D for pretraining #24

Closed by chuyishang 1 year ago

chuyishang commented 1 year ago

In the datasets readme, it says that the preprocessing procedure for Ego4D is TBA. I was wondering if you guys could share what processing you did to Ego4D.

chuyishang commented 1 year ago

Judging from the output directory format, is the preprocessing just splitting the videos into 5-minute clips, or is more preprocessing done? Thanks in advance!

zhaoyue-zephyrus commented 1 year ago

Hi Chuyi,

> Judging from the output directory format, is the preprocessing just splitting the videos into 5 minute clips? Or is there more preprocessing that is done? Thanks in advance!

You are mostly right. We split the videos into 5-minute chunks and additionally resize them so that the shorter side is 288 px. Please refer to the updated readme; I've also attached a script for your reference in the latest commit #25.
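For readers who want a feel for the two steps described above (5-minute chunks, shorter side resized to 288 px), here is a minimal sketch. It is not the script from the repository. The helper names (`chunk_starts`, `resized_dims`, `ffmpeg_cmd`), the rounding of the longer side to an even number, and the particular ffmpeg invocation are all assumptions made for illustration; the readme and the attached script remain the reference.

```python
import math

CHUNK_SECONDS = 300  # 5-minute chunks, as stated in the reply
SHORT_SIDE = 288     # target shorter side in pixels, as stated in the reply


def chunk_starts(duration_s, chunk_s=CHUNK_SECONDS):
    """Return the start offsets (in seconds) of consecutive fixed-length chunks."""
    n_chunks = max(1, math.ceil(duration_s / chunk_s))
    return [i * chunk_s for i in range(n_chunks)]


def resized_dims(width, height, short_side=SHORT_SIDE):
    """Scale (width, height) so the shorter side equals short_side.

    Aspect ratio is preserved. The longer side is rounded to an even
    number (an assumption here; many H.264 setups require even sizes).
    """
    scale = short_side / min(width, height)

    def to_even(x):
        return int(round(x / 2)) * 2

    if width <= height:
        return short_side, to_even(height * scale)
    return to_even(width * scale), short_side


def ffmpeg_cmd(src, dst, start_s, width, height):
    """Build a hypothetical ffmpeg command for one resized chunk.

    src/dst are caller-supplied paths; the output naming used by the
    actual repository script is not assumed here.
    """
    w, h = resized_dims(width, height)
    return [
        "ffmpeg",
        "-ss", str(start_s),       # seek to the chunk start
        "-i", src,
        "-t", str(CHUNK_SECONDS),  # keep at most one chunk of video
        "-vf", f"scale={w}:{h}",   # resize to the computed dimensions
        dst,
    ]
```

For example, a 650-second 1920x1080 video would yield chunks starting at 0, 300, and 600 seconds, each resized to 512x288. Each command list can be passed to `subprocess.run(cmd, check=True)` if ffmpeg is installed.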