Closed SCZwangxiao closed 2 years ago
Hi, glad to hear from you. The difference in the movie key stems from the origin of the data. For the training set we use the same identifiers as the AudioVault website, while the val/test splits originate from the LSMDC dataset, so we used the same keys as that dataset (which include part of the movie name). If you are interested in knowing which titles are in the training set, I can send you the list, or the mapping between the ids and the IMDb ids.
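If you want to check which key style a split uses, here is a minimal sketch. It assumes the annotations are loaded as a dict whose entries each carry a `movie` field; the helper names and the toy entries are made up for illustration and are not real MAD data, so adapt them to the actual files.

```python
from collections import Counter


def key_style(movie_key: str) -> str:
    """Classify a movie key as a purely numeric id or a name-like key."""
    return "numeric_id" if movie_key.isdigit() else "contains_name"


def summarize_key_styles(annotations: dict) -> Counter:
    """Count key styles over the distinct movies in an annotation dict."""
    # Several annotations can point to the same movie, so deduplicate first.
    movies = {entry["movie"] for entry in annotations.values()}
    return Counter(key_style(m) for m in movies)


# Toy entries invented for illustration (hypothetical, not real MAD keys).
toy_train = {
    "0": {"movie": "10142"},
    "1": {"movie": "10142"},
    "2": {"movie": "2738"},
}
toy_test = {"0": {"movie": "0001_Example_Title"}}

print(summarize_key_styles(toy_train))
print(summarize_key_styles(toy_test))
```

With real data you would replace the toy dicts with the loaded annotation files; a training split should then report only numeric ids, and val/test only name-like keys.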
If you have any other questions please let me know.
Best, Mattia
On Thu, 7 Apr 2022 at 17:59, Wang Xiao @.***> wrote:
Hi, the dataset is awesome, and I am browsing through it. I've found that the format of the movie key differs across splits. Although you've mentioned in DataInspection.ipynb that the movie key in the annotation file is not the title, all movie keys in the test & val sets contain the movie title, while all those in the train set are purely numeric ids. Is there any particular benefit to this difference? I am just wondering whether my favorite movie is in the training set.