danieljf24 / awesome-video-text-retrieval

A curated list of deep learning resources for video-text retrieval.
592 stars 67 forks source link

Hello. Introduce a paper #6

Open wjun0830 opened 1 year ago

wjun0830 commented 1 year ago

Hello. We'd like to introduce our paper "Query-Dependent Video Representation for Moment Retrieval and Highlight Detection (CVPR 2023 Paper)" regarding cross-modal moment retrieval.

Code : https://github.com/wjun0830/QD-DETR Arxiv : https://arxiv.org/abs/2303.13874

nguyentthong commented 2 months ago

Hi,

Thank you for your wonderful survey!

Would you mind adding 2 papers about video-text retrieval.

Paper 1: Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning

Accepted at ECCV 2024.

It leverages LLaVA to increase the scale of training data to video-text retrieval. The approach is to forward the concatenated frames of a video to LLaVA to generate the caption for the video.

Paper link: https://arxiv.org/abs/2407.03788

Code link: https://github.com/nguyentthong/meta_optimized_angular_margin_contrastive_lvlm

Paper 2: Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives

Accepted at ACL 2024 as Findings.

This paper summarizes video-text retrieval methods from model architecture, model training, and data perspectives.

Paper link: https://arxiv.org/abs/2406.05615

Code link: https://github.com/nguyentthong/video-language-understanding

Thanks a lot!