albanie / collaborative-experts

Video embeddings for retrieval with natural language queries
https://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/
Apache License 2.0

About MSVD #5

Closed (sharontaozi closed this issue 4 years ago)

sharontaozi commented 4 years ago

In the MSVD dataset, the number of text descriptions differs from video to video. How do you handle this in your experiments?

albanie commented 4 years ago

During training, we sample one description from the collection for each forward pass.
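
To illustrate the idea, here is a minimal sketch of per-step caption sampling. It is not the repository's actual data loader: the toy `captions` dictionary, the `sample_batch` helper, and the choice of uniform sampling are all assumptions made for this example.

```python
import random

# Hypothetical toy data: each video id maps to a variable-length list of captions.
captions = {
    "vid_a": ["a man is cooking", "someone fries an egg", "a person cooks food"],
    "vid_b": ["a dog runs"],
    "vid_c": ["a girl sings", "a woman is singing on stage"],
}


def sample_batch(captions_by_video, rng):
    """Return one (video_id, caption) pair per video.

    The caption is drawn uniformly at random (an assumption), so the number
    of captions per video does not affect the batch shape.
    """
    return [(vid, rng.choice(caps)) for vid, caps in captions_by_video.items()]


rng = random.Random(0)  # seeded only to make the sketch reproducible
for step in range(3):
    # A fresh draw on every step: over many steps each caption gets used.
    batch = sample_batch(captions, rng)
    print(step, batch)
```

With this scheme every video contributes exactly one text per forward pass, regardless of how many descriptions it has, and repeated sampling over training exposes the model to the full set of captions.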

albanie commented 4 years ago

@sharontaozi, closing for now (but feel free to re-open if it's still unclear).

xixiareone commented 4 years ago

The article mentions that "where they randomly chose 5 ground-truth sentences per video. We use the same setting when we compare with that approach". Does this mean that 5 sentences are sampled at random for the training, validation, and test sets alike? In other words, are the remaining sentences left unused in all three splits?