Closed by KaiyueSun98 5 months ago
Hi, thanks for your great work. Could you confirm which fine-tuned UMT model you used to calculate the UMTScore: the one fine-tuned on video-text retrieval or the one fine-tuned on VQA?
Thanks for your attention to our work! We adopt the version fine-tuned on video-text retrieval.
Many thanks!