Could someone help with pointers/code to visualize the attention maps generated by MobileViT?

Hi @chinhsuanwu / All,

Could you please help with pointers or code to visualize the attention maps generated by MobileViT? While a regular ViT attention map can be rescaled to the image size and interpolated, I am finding it hard to figure out how to generate a visualization for an attention map of size torch.Size([1, 4, 4, 16, 16]) for a single image. I understand that 16x16 is the attention and that the 4 at index 1 is the number of heads, but I am unable to wrap my head around the 4 at index 2 and how it would contribute to plotting the visualization. This is a bit urgent, and it would really help if anyone has thoughts on this.

Regards, Arun
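
One possible way to approach this, offered as a sketch rather than a confirmed answer. It assumes the tensor is laid out as (batch, pixel position within a patch, head, query patch, key patch), which is what a `b p h n d` rearrange in the attention module would produce with 2x2 patches and 4 heads. Under that assumption, one of the two 4s is the 2x2 pixel slots inside each patch and the other is the heads, and the 16 is a 4x4 grid of patches over an 8x8 feature map. Please check this against your own code: since both dimensions are 4, swap `axis` values if your order is (batch, head, pixel position, ...). The function name `attention_to_heatmap` and its parameters are made up for illustration, and the 256x256 image size is also an assumption.

```python
import numpy as np


def attention_to_heatmap(attn, query_idx, grid_hw=(4, 4), patch_hw=(2, 2),
                         image_hw=(256, 256)):
    """Turn an attention array of shape (B, P, H, N, N) into image-sized heatmaps.

    ASSUMED layout (verify against your model):
      B = batch, P = pixel slots inside a patch (ph * pw),
      H = heads, N = number of patches (grid_h * grid_w),
      last two axes = (query patch, key patch).
    """
    attn = np.asarray(attn, dtype=np.float64)
    b, p, h, n, n2 = attn.shape
    gh, gw = grid_hw
    ph, pw = patch_hw
    assert n == n2 == gh * gw, "patch grid does not match attention size"
    assert p == ph * pw, "patch size does not match pixel-slot dimension"

    # Average over heads -> (B, P, N, N).
    a = attn.mean(axis=2)
    # Keep one query patch: how much it attends to every key patch,
    # separately for each pixel slot -> (B, P, N).
    a = a[:, :, query_idx, :]
    # Undo the unfolding: (B, ph, pw, gh, gw) -> (B, gh, ph, gw, pw)
    # -> feature map of shape (B, gh * ph, gw * pw).
    a = a.reshape(b, ph, pw, gh, gw).transpose(0, 3, 1, 4, 2)
    a = a.reshape(b, gh * ph, gw * pw)

    # Nearest-neighbour upsampling to the image size.
    img_h, img_w = image_hw
    fh, fw = a.shape[1], a.shape[2]
    assert img_h % fh == 0 and img_w % fw == 0, "image size must be a multiple"
    a = np.repeat(np.repeat(a, img_h // fh, axis=1), img_w // fw, axis=2)
    return a  # shape (B, img_h, img_w)


# Toy check with a random, softmax-normalised attention array.
rng = np.random.default_rng(0)
logits = rng.normal(size=(1, 4, 4, 16, 16))
toy_attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
heat = attention_to_heatmap(toy_attn, query_idx=5)
print(heat.shape)
```

With a real model you would pass something like `attn.detach().cpu().numpy()` and overlay the result on the input, for example with matplotlib's `plt.imshow(image)` followed by `plt.imshow(heat[0], alpha=0.5, cmap="jet")`. Averaging over heads is only one choice; plotting each head separately (skip the `mean` and index the head axis instead) may be more informative.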

chinhsuanwu / mobilevit-pytorch, issue #15