Parameters for replicating Zero-Shot evaluation retrieval results

anikethjr commented 3 years ago

Hello,

Thank you so much for sharing your code and pretrained models. I was trying to replicate your text-video retrieval results on the MSR-VTT dataset. I obtained the pretrained model from https://www.rocq.inria.fr/cluster-willow/amiech/howto100m/s3d_howto100m.pth and ran the evaluation command from the README. The only change I made was a smaller batch size; all other parameters are unchanged:

python3 eval_msrvtt.py --batch_size=2 --num_thread_reader=20 --num_windows_test=10 --eval_video_root=path_to_videos --pretrain_cnn_path=path_to_pretrained_model

I get the following results:

R@1: 0.1 - R@5: 0.25 - R@10: 0.35 - Median R: 31.0

These numbers are much lower than the ones mentioned in the table. I am guessing that the evaluation parameters are different since changing the batch size should not affect the results. Could you please tell me what parameter values were used to obtain the results mentioned in the table?

Thank you!

antoine77340 commented 3 years ago

Hi,

You can find our results in Table 5 of the main paper, where we get R@1: 0.099 - R@5: 0.24 - R@10: 0.324 - Median R: 29.5. Aren't these quite similar to the results you get overall? (Some metrics are better than others.)

The results are not exactly the same as in the paper because this is a complete from-scratch reimplementation of the original work, which used internal tools from DeepMind. Hence, some small differences in implementation could lead to minor differences in results.

anikethjr commented 3 years ago

Hey,

I didn't realize that the metrics reported by the code are ratios and not percentages as in the paper. Thank you so much for the clarification!!
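For anyone else who hits the same confusion, here is a minimal sketch of how recall@k and median rank are typically computed as ratios and then converted to the percentage form used in the paper. This is not the code from eval_msrvtt.py; the function names `retrieval_metrics` and `as_percentages` are hypothetical, and the sketch assumes a square similarity matrix where caption i matches video i.

```python
from statistics import median


def retrieval_metrics(sim):
    """Compute recall@k (as ratios in [0, 1]) and median rank.

    Assumption: sim[i][j] is the similarity between caption i and video j,
    and the correct video for caption i is video i.
    """
    ranks = []
    for i, row in enumerate(sim):
        # Rank of the correct video: 1 + number of videos scored strictly higher.
        ranks.append(1 + sum(1 for score in row if score > row[i]))
    n = len(ranks)
    return {
        "R@1": sum(r <= 1 for r in ranks) / n,
        "R@5": sum(r <= 5 for r in ranks) / n,
        "R@10": sum(r <= 10 for r in ranks) / n,
        "Median R": median(ranks),
    }


def as_percentages(metrics):
    """Scale recall ratios to percentages; the median rank is left unchanged."""
    return {
        name: value if name == "Median R" else round(100 * value, 1)
        for name, value in metrics.items()
    }


# The ratios printed in this thread, converted for comparison with the paper.
reported = {"R@1": 0.1, "R@5": 0.25, "R@10": 0.35, "Median R": 31.0}
print(as_percentages(reported))
```

Read this way, the ratios I reported (0.1, 0.25, 0.35) correspond to 10.0, 25.0 and 35.0 percent, which is in the same range as the numbers quoted above.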

antoine77340 / MIL-NCE_HowTo100M
