xwen99 / temporal_context_aggregation

Temporal Context Aggregation for Video Retrieval with Contrastive Learning, WACV 2021
https://arxiv.org/abs/2008.01334
Apache License 2.0

Questions about evaluation #2

glee1228 closed this issue 2 years ago

glee1228 commented 3 years ago

Hello, I'm trying to reproduce the code after reading the paper, and I have a few questions about the evaluation. evaluation.py calls a FeatureDataset class that is supposed to live in data.py, but that class is missing. Can you share it with me? I would also like to know exactly what the padding size means during evaluation, and what value I should set it to. Thank you.

xwen99 commented 3 years ago

Hi, the code is now updated in commit 0ba11b2c2fff57d484bf2f788072c75f457c804e.

About padding_size: since we want to read a batch of videos with different lengths (i.e., different numbers of frames), we need to pad every video with blank frames to a fixed length so that the batch forms a single tensor. Do not worry, the blank frames are simply ignored during evaluation. To avoid information loss, it is recommended to set the padding size greater than or equal to the maximum video length in the dataset (e.g., 300 for FIVR when sampling at 1 fps).
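For readers who want to see the idea concretely, here is a minimal NumPy sketch of padding variable-length videos to a fixed length and ignoring the blank frames afterwards. It is not the repository's FeatureDataset: the names `pad_videos` and `masked_mean` are hypothetical, and masked mean pooling is used only as a simple stand-in for whatever mechanism the actual model uses to skip padded frames.

```python
import numpy as np


def pad_videos(videos, padding_size):
    """Pad (or truncate) per-video frame features to a fixed length.

    videos: list of arrays, each shaped (num_frames_i, feat_dim).
    Returns (batch, mask), where batch is (N, padding_size, feat_dim)
    and mask is (N, padding_size), True for real frames.
    """
    feat_dim = videos[0].shape[1]
    batch = np.zeros((len(videos), padding_size, feat_dim), dtype=np.float32)
    mask = np.zeros((len(videos), padding_size), dtype=bool)
    for i, frames in enumerate(videos):
        # Frames beyond padding_size are dropped, which is the
        # information loss a large enough padding_size avoids.
        n = min(len(frames), padding_size)
        batch[i, :n] = frames[:n]
        mask[i, :n] = True
    return batch, mask


def masked_mean(batch, mask):
    """Average over real frames only, so blank frames have no effect."""
    weights = mask[..., None].astype(batch.dtype)
    counts = np.maximum(weights.sum(axis=1), 1.0)  # guard against empty videos
    return (batch * weights).sum(axis=1) / counts


# Two toy "videos" with 3 and 5 frames of 4-dimensional features.
videos = [
    np.ones((3, 4), dtype=np.float32),
    2 * np.ones((5, 4), dtype=np.float32),
]
padding_size = max(len(v) for v in videos)  # >= longest video, nothing is lost
batch, mask = pad_videos(videos, padding_size)
pooled = masked_mean(batch, mask)
```

With `padding_size` set to the longest video, every real frame is kept and the mask records which positions are padding. Choosing a smaller value would silently truncate the longer videos.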