Soldelli / MAD

MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
MIT License

Question on the ``movie`` key in annotation. #2

Closed SCZwangxiao closed 2 years ago

SCZwangxiao commented 2 years ago

Hi, the dataset is awesome, and I have been exploring it. I've found that the format of the movie key differs between splits. Although you mention in DataInspection.ipynb that the movie key in the annotation file is not the title, all movie keys in the test and val sets contain the movie title, while those in the train set are purely numeric IDs. Is there any particular reason for this difference? I am just wondering whether my favorite movie is in the training set.

Soldelli commented 2 years ago

Hi @SCZwangxiao, the difference in the movie key stems from the origin of the data.

If you have any other questions please feel free to reach out.

Soldelli commented 2 years ago

Hi, glad to hear from you. The difference in the movie key stems from the origin of the data. For the training set we use the same identifier as on the audiovault website, while the val/test splits originated from the LSMDC dataset, so we used the same keys as that dataset (which include part of the movie name). If you are interested in knowing which titles are in the training set, I can send you the list, or the mapping between the ids and the IMDb ids.
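To see the two key styles side by side, a minimal sketch along the following lines may help. It assumes each annotation file loads as a dictionary whose values carry a `movie` field; the helper name `movie_keys_by_style` and all sample ids and sentences below are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def movie_keys_by_style(annotations):
    """Group the distinct `movie` values into numeric-only and title-bearing keys.

    `annotations` is assumed to be a dict of annotation-id -> record, where
    each record has a `movie` field (an assumption about the file layout).
    """
    groups = defaultdict(set)
    for record in annotations.values():
        key = str(record["movie"])
        # Train-style keys are described as purely numeric ids; val/test-style
        # keys are described as containing part of the movie name.
        style = "numeric" if key.isdigit() else "with_title"
        groups[style].add(key)
    return {style: sorted(keys) for style, keys in groups.items()}


# Hypothetical records standing in for the real annotation files.
sample_train = {
    "0": {"movie": "10142", "sentence": "A man opens the door."},
    "1": {"movie": "10142", "sentence": "He walks inside."},
    "2": {"movie": "20375", "sentence": "A car pulls away."},
}
sample_val = {
    "0": {"movie": "0000_EXAMPLE_TITLE", "sentence": "She looks up."},
}

print(movie_keys_by_style(sample_train))
print(movie_keys_by_style(sample_val))
```

With the real files, the same function could be applied to the result of `json.load` on each split's annotation file, assuming the files are JSON with this layout.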

If you have any other questions please let me know.

Best, Mattia

