Hi there, thanks so much for the interesting work!
I'm a bit unsure how to reproduce the attention map visualizations in the AVT paper, the ones showing which frames attend strongly to which other frames in AVT-h. Is there any code available for reproducing those visualizations?
Thanks in advance for your help!
Vineet