Closed by nikky4D 2 years ago
Hi, you can use the code in this notebook to visualize Grad-CAM on the cross-attention maps for BLIP:
https://github.com/salesforce/ALBEF/blob/main/visualization.ipynb
Thank you. I'll take a look.
Is there any documentation that explains how to convert an attention map to image positions?
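For what it's worth, here is a minimal sketch of one way to map attention back to image positions. It is not taken from the BLIP or ALBEF code. It assumes a ViT-style image encoder where the first image token is [CLS] and the remaining tokens are square patches laid out row by row, and it assumes a patch size of 16 (so a 384x384 input would give 24x24 = 576 patch tokens). The function names are made up for illustration.

```python
import math

def attention_to_grid(token_attn, has_cls=True):
    """Reshape a flat per-image-token attention vector into a square patch grid.

    Assumes tokens are ordered row by row, optionally preceded by a [CLS] token.
    """
    patches = list(token_attn[1:] if has_cls else token_attn)
    side = math.isqrt(len(patches))
    if side * side != len(patches):
        raise ValueError("patch tokens do not form a square grid")
    return [patches[r * side:(r + 1) * side] for r in range(side)]

def patch_to_pixel_box(row, col, patch_size=16):
    """Return the (left, top, right, bottom) pixel bounds of patch (row, col)."""
    left, top = col * patch_size, row * patch_size
    return (left, top, left + patch_size, top + patch_size)

def upsample_nearest(grid, patch_size=16):
    """Expand every grid cell into a patch_size x patch_size block of pixels."""
    out = []
    for row in grid:
        pixel_row = [v for v in row for _ in range(patch_size)]
        out.extend(list(pixel_row) for _ in range(patch_size))
    return out

# Example: 1 [CLS] token + 576 patch tokens (hypothetical values).
attn = [0.0] + [float(i) for i in range(576)]
grid = attention_to_grid(attn)   # 24 x 24 patch grid
heat = upsample_nearest(grid)    # 384 x 384 heatmap, one value per pixel
```

The resulting heatmap has the same size as the (resized) input image, so it can be overlaid on it directly. Nearest-neighbor upsampling is used here only to keep the sketch dependency-free; bilinear resizing usually gives a smoother overlay.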
Hi, I was wondering whether you managed to get the cross-attention maps for BLIP. If so, could you please share your code with us?
I have the same question about visualizations.
Is there a way to visualize what/where BLIP focuses on in an image when given an input text? Something similar to Grad-CAM for visualizing weights.