The official implementation of the model from "Improving Interpretable Embeddings for Ad-hoc Video Search with Generative Captions and Multi-word Concept Bank", published at ICMR 2024.
Your paper is very inspiring. I would like to reproduce its results, especially on the TREC-related datasets. However, the homepage offers many resources for download, and I am not sure which ones are needed or how they should be arranged. Could you provide a file tree and an inference-only script for reference?
Something like this:
$ tree data
data
└── Improved-ITV
    ├── AVS_data
    ├── AVS_feature_data/
    ├── Improved_ITV_trained_models/
    └── tgif-msrvtt10k-VATEX/
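
To make the request more concrete, here is a minimal sketch of the kind of layout check I have in mind. The directory names are only my guess based on the tree above, and `missing_subdirs` is a hypothetical helper, not something taken from the repository; the real layout may well differ.

```python
from pathlib import Path

# Assumed layout, copied from the tree above; the actual names may differ.
EXPECTED_SUBDIRS = [
    "AVS_data",
    "AVS_feature_data",
    "Improved_ITV_trained_models",
    "tgif-msrvtt10k-VATEX",
]


def missing_subdirs(root, expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectory names that are absent under root."""
    base = Path(root)
    return [name for name in expected if not (base / name).is_dir()]


if __name__ == "__main__":
    # "data/Improved-ITV" is the assumed root from the tree above.
    missing = missing_subdirs("data/Improved-ITV")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

A check like this would let me confirm that the downloads are in the right place before running inference.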