danieljf24 / dual_encoding

[CVPR2019] Dual Encoding for Zero-Example Video Retrieval
Apache License 2.0
154 stars 31 forks source link

Seeking information for CNN model used on MSRVTT #7

Closed e0lithic closed 5 years ago

e0lithic commented 5 years ago

Hi,

First of all great work. I have managed to reproduce your results. However could you please provide additional information on the model you have utilized for video feature extraction . I tried experimenting with features extracted using the torchvision Resnet-152 model (pre trained weights). However, they didn't performed particularly well using the trained model you have provided for Dual Encoding.

I assume since you have trained your model using features from a particular Resnet model the dual encoding is biased towards it . In order to achieve a good result using your trained model, the same CNN model needs to be utilized for the feature extraction.

So could you please give more information about the particular variant of the Resnet-152 model you utilized.

Thanks