Ziyang412 / UCoFiA

PyTorch code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023)
https://arxiv.org/abs/2309.10091
MIT License
61 stars, 0 forks

Which line is the implementation of Interactive Similarity Attention (ISA)? #5

Closed: longmalongma closed this issue 5 months ago

longmalongma commented 6 months ago

Hi, which line contains the implementation of Interactive Similarity Attention (ISA)?

longmalongma commented 6 months ago

[image] In the Interactive Similarity Attention (ISA) module, what happens if the t dimension (the number of video frames) keeps growing?

Ziyang412 commented 5 months ago

Hi, sorry for not mentioning this explicitly in the code.

For ISA, please check lines 352-353: https://github.com/Ziyang412/UCoFiA/blob/517f838483af544304482bc70ee7ff4886d3dfc6/train/modules/modeling_ucofia.py#L352

For Bi-ISA, please check lines 360-368: https://github.com/Ziyang412/UCoFiA/blob/517f838483af544304482bc70ee7ff4886d3dfc6/train/modules/modeling_ucofia.py#L360

I think the problem of an "ever-growing t" is out of the scope of our work, but it's a meaningful future direction to work on. Thanks!