Hi,
Not an actual issue, just wanted to share that I implemented your technique for Vision Transformers.
https://github.com/jacobgil/vit-explain
It includes a few tweaks to make the visualizations work well for images: discarding the lowest attention values, and fusing the attention heads with max instead of mean.
I also added an option to make the result class-specific, by weighting the attention with the gradient of the target class score (and masking out negative gradients).
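In case it's useful, here is a minimal sketch of the rollout variant with those two tweaks, assuming the attentions arrive as a list of per-layer NumPy arrays of shape (heads, tokens, tokens) with the CLS token at index 0. The function name and parameters are illustrative, not the repo's exact API.

```python
import numpy as np

def rollout(attentions, discard_ratio=0.9, head_fusion="max"):
    """Attention rollout with head fusion and low-attention discarding."""
    num_tokens = attentions[0].shape[-1]
    result = np.eye(num_tokens)
    for attn in attentions:
        # Fuse heads: max keeps the strongest head per position,
        # which tends to give sharper maps than averaging.
        if head_fusion == "max":
            fused = attn.max(axis=0)
        else:
            fused = attn.mean(axis=0)
        # Discard the lowest attention values, but never the CLS column,
        # so the CLS token's path through the layers is preserved.
        flat = fused.flatten()
        k = int(flat.size * discard_ratio)
        drop = np.argsort(flat)[:k]
        drop = drop[drop % num_tokens != 0]
        flat[drop] = 0.0
        fused = flat.reshape(fused.shape)
        # Account for residual connections, then row-normalize.
        a = (fused + np.eye(num_tokens)) / 2.0
        a = a / a.sum(axis=-1, keepdims=True)
        result = a @ result
    # Attention flowing from the CLS token to the patch tokens.
    return result[0, 1:]
```

For the class-specific option, the same loop can weight each layer's attention by the gradient of the target class score with respect to that attention, with negative gradients clamped to zero before fusing.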