Hi,
Thank you for your wonderful survey!
Would you mind adding two papers about video-text retrieval?
Paper 1: Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning
Accepted at ECCV 2024.
It leverages LLaVA to increase the scale of training data for video-text retrieval. The approach forwards the concatenated frames of a video to LLaVA, which generates a caption for the video.
Paper link: https://arxiv.org/abs/2407.03788
Code link: https://github.com/nguyentthong/meta_optimized_angular_margin_contrastive_lvlm
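For context, the frame-concatenation step can be sketched as below. This is only an illustrative sketch, not the paper's actual pipeline: the frame count, frame size, and helper name are hypothetical, and the LLaVA captioning call itself is elided.

```python
import numpy as np

def concat_frames(frames):
    """Tile sampled video frames into one horizontal strip.

    The resulting single image could then be fed to an image-text
    model such as LLaVA to produce a caption for the whole video.
    (Hypothetical helper; the paper's exact procedure may differ.)
    """
    # frames: list of (H, W, 3) uint8 arrays sampled from the video
    return np.concatenate(frames, axis=1)  # shape (H, W * n_frames, 3)

# Example: 4 dummy 224x224 RGB frames -> one 224x896 strip
frames = [np.zeros((224, 224, 3), dtype=np.uint8) for _ in range(4)]
strip = concat_frames(frames)
```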
Paper 2: Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
Accepted at ACL 2024 as Findings.
This paper summarizes video-text retrieval methods from model architecture, model training, and data perspectives.
Paper link: https://arxiv.org/abs/2406.05615
Code link: https://github.com/nguyentthong/video-language-understanding
Thanks a lot!