What method did you use to visualize the attention in Fig.1(a) and Fig.5?

knightyxp / DGL

[AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval. Also, visualization and qb norm search for best performance will be updated ASAP.

What method did you use to visualize the attention in Fig.1(a) and Fig.5? #3

Closed jianghaojun closed 1 month ago

knightyxp commented 1 month ago

For the CLIP4Clip visualization, we extract the attention weights from the [CLS] token to the patch tokens of each frame. In DGL, the global prompts attend to all patches in all frames, which lets us display the dynamic attention weights across all frame tokens. The visualization code is currently in another branch. I will integrate it into the main branch soon if you need it.
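Until that branch is available, the general idea described above can be sketched as follows. This is not the DGL visualization code, only a minimal illustration under stated assumptions: the attention tensor of one layer is available as a softmaxed array of shape `(heads, tokens, tokens)`, the query token ([CLS] or a global prompt) comes before the patch tokens, and patch tokens are ordered frame by frame. The function name `query_to_patch_heatmaps` and all of its parameters are hypothetical.

```python
import numpy as np


def query_to_patch_heatmaps(attn, query_idx, patch_start, num_frames, grid_h, grid_w):
    """Turn one query token's attention over patch tokens into per-frame heatmaps.

    attn: softmaxed attention of one layer, shape (heads, tokens, tokens).
    query_idx: index of the query token ([CLS] or a global prompt); assumed layout.
    patch_start: index of the first patch token; assumed layout.
    """
    num_patches = num_frames * grid_h * grid_w
    # Average over heads, then take the attention row of the chosen query token.
    row = attn.mean(axis=0)[query_idx]
    # Keep only the weights that point at patch tokens.
    patch_weights = row[patch_start:patch_start + num_patches]
    # Assumes patch tokens are ordered frame by frame, row-major within a frame.
    maps = patch_weights.reshape(num_frames, grid_h, grid_w)
    # Min-max normalize over all frames jointly so frames stay comparable.
    lo, hi = maps.min(), maps.max()
    return (maps - lo) / (hi - lo + 1e-8)


# Toy example with random attention: 1 prompt token followed by 3 frames of 2x2 patches.
rng = np.random.default_rng(0)
heads, frames, gh, gw, n_prompts = 4, 3, 2, 2, 1
tokens = n_prompts + frames * gh * gw
logits = rng.normal(size=(heads, tokens, tokens))
attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

maps = query_to_patch_heatmaps(
    attn, query_idx=0, patch_start=n_prompts,
    num_frames=frames, grid_h=gh, grid_w=gw,
)
print(maps.shape)  # (3, 2, 2)
```

For a per-frame [CLS] visualization in the CLIP4Clip style, the same function could be called once per frame with `num_frames=1`, `query_idx=0`, and `patch_start=1`, assuming each frame is encoded separately with its own [CLS] token in position 0. Each resulting map can then be upsampled to the frame resolution and overlaid on the image.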

jianghaojun commented 1 month ago

It would be of great help to me if you could provide the code, thanks a lot!

knightyxp commented 1 month ago

Hi haojun, I am currently too busy with a new project. Can you give me your email so I can send you the zip of the visualization branch code? I will merge the visualization code to the main branch when I am free.

jianghaojun commented 1 month ago

Thanks a lot!

Here is my email jianghaojunthu@163.com.

knightyxp commented 1 month ago

Sent. I will close this issue~

zef1611 commented 1 month ago

Hi, can I also ask for the visualization code, @jianghaojun @knightyxp? It would be a great help for my ongoing work. Here is my email: lehuy2316@gmail.com. Thanks a lot, and I hope you have a nice day. Cheers :D