microsoft / XPretrain

Multi-modality pre-training

Long Video Processing in LF-VILA #8

Closed: vateye closed this issue 1 year ago

vateye commented 1 year ago

Hi, I am wondering how to read long videos and extract frames efficiently, as described in the LF-VILA paper.

ycsun1972 commented 1 year ago

Hi, we use decord to read the videos, and we reduce their resolution and fps to improve efficiency. You can refer to the dataset part of our code for more details.
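
For readers who want something concrete, here is a minimal sketch of the general idea (decord plus lower resolution and lower fps). It is not the LF-VILA dataset code. The helper names `sample_indices` and `load_frames`, the 2 fps target, and the 224x224 size are all assumptions chosen for illustration. The sketch resizes and sub-samples at decode time, whereas the reply above may refer to re-encoding the videos offline.

```python
import math


def sample_indices(num_frames, native_fps, target_fps):
    """Return the frame indices to keep when reducing native_fps to target_fps.

    Hypothetical helper: keeps roughly one frame every native_fps / target_fps
    frames, and never upsamples (the step is at least 1).
    """
    if num_frames <= 0:
        return []
    step = max(native_fps / target_fps, 1.0)
    count = math.ceil(num_frames / step)
    return [min(int(i * step), num_frames - 1) for i in range(count)]


def load_frames(path, target_fps=2.0, width=224, height=224):
    """Decode a video at reduced resolution and fps with decord.

    The default values are illustrative assumptions, not the paper's settings.
    Returns a uint8 array of shape (num_sampled_frames, height, width, 3).
    """
    # Imported lazily so the index helper can be used without decord installed.
    from decord import VideoReader, cpu

    # Passing width/height makes decord resize frames while decoding.
    vr = VideoReader(path, ctx=cpu(0), width=width, height=height)
    indices = sample_indices(len(vr), vr.get_avg_fps(), target_fps)
    # get_batch decodes only the requested frames in one call.
    return vr.get_batch(indices).asnumpy()
```

Calling `load_frames("video.mp4")` (a placeholder path) would then decode only the sampled frames instead of every frame of a long video, which is where most of the saving comes from.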

vateye commented 1 year ago

Thanks for your reply. When will the code and the dataset be available?

ycsun1972 commented 1 year ago

The code is currently going through our company's code review process, and we will release it once the review is complete. Thanks for your patience.