Soldelli / MAD

MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
MIT License

video data problem #4

Closed by wuyou111 1 year ago

wuyou111 commented 1 year ago

Hi, thanks for the great work, which helps me a lot. I have signed the NDA and downloaded the dataset you provided, but I found that the video data is given in the form of feature vectors. I'm wondering if I could use a model other than CLIP, such as C3D, to extract the video features. Could you provide me with the original video data? I want to use other models (such as C3D) to extract features for my future paper, which would be more convenient and flexible for me. I solemnly promise that the data received will only be used for my own academic purposes, and will never be disseminated or used for commercial purposes. Looking forward to your reply :D

Soldelli commented 1 year ago

Dear @wuyou111, kindly provide us with the code for C3D feature extraction and a requirements file for the environment, and we will get back to you with the features you requested.

wuyou111 commented 1 year ago

Dear @Soldelli, I have reached you via email, and this problem has been solved very well :D