catherinesyeh / attention-viz

Visualizing query-key interactions in language + vision transformers
http://attentionviz.com/
MIT License
122 stars, 15 forks

2D colormap for positional embedding of ViT #50

Closed by yc015 1 year ago

yc015 commented 1 year ago

Find and add a 2D colormap to represent the positional embeddings of image patches in ViT.

Alternatively, add two separate coloring schemes: one that encodes each image token's row (its position along the y-axis of the image) and one that encodes its column (its position along the x-axis).
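For illustration, one way a 2D colormap could work is to blend four corner colors bilinearly across the patch grid, so every (row, column) pair gets its own color. This is only a minimal sketch, not the project's code: the function name `patch_color_2d`, the corner colors, and the 7x7 grid in the usage note are all assumptions.

```python
from typing import Tuple

RGB = Tuple[float, float, float]

# Hypothetical corner colors (an assumption, not taken from the project).
TOP_LEFT: RGB = (1.0, 0.0, 0.0)      # red
TOP_RIGHT: RGB = (1.0, 1.0, 0.0)     # yellow
BOTTOM_LEFT: RGB = (0.0, 0.0, 1.0)   # blue
BOTTOM_RIGHT: RGB = (0.0, 1.0, 0.0)  # green


def _lerp(a: RGB, b: RGB, t: float) -> RGB:
    """Linearly interpolate between two RGB colors, t in [0, 1]."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def patch_color_2d(row: int, col: int, n_rows: int, n_cols: int) -> RGB:
    """Map a patch's grid position to a color by bilinear interpolation."""
    # Normalize the grid position to [0, 1]; a single row/column maps to 0.
    v = row / (n_rows - 1) if n_rows > 1 else 0.0
    u = col / (n_cols - 1) if n_cols > 1 else 0.0
    # Blend along the top and bottom edges, then between the two edges.
    top = _lerp(TOP_LEFT, TOP_RIGHT, u)
    bottom = _lerp(BOTTOM_LEFT, BOTTOM_RIGHT, u)
    return _lerp(top, bottom, v)
```

For example, on an assumed 7x7 patch grid, `patch_color_2d(0, 0, 7, 7)` gives the top-left corner color and `patch_color_2d(6, 6, 7, 7)` the bottom-right one. With these particular corners the red channel depends only on the row and the green channel only on the column, so no two patches share a color.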

yc015 commented 1 year ago

Not 2D, but solved this using two 1D colormaps: one for rows and one for columns.
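The two-colormap approach might be sketched roughly as follows. This is an illustrative guess rather than the code that was merged: the helper names, the hues, the light-to-dark ramp, and the assumption that patch indices are row-major with no class token are all hypothetical.

```python
import colorsys
from typing import Tuple

RGB = Tuple[float, float, float]


def row_col(patch_index: int, n_cols: int) -> Tuple[int, int]:
    """Convert a flat patch index to (row, col), assuming row-major order."""
    return divmod(patch_index, n_cols)


def sequential_color(index: int, n: int, hue: float) -> RGB:
    """Light-to-dark ramp of one hue for position `index` out of `n`."""
    t = index / (n - 1) if n > 1 else 0.0
    lightness = 0.85 - 0.6 * t  # from 0.85 (light) down to 0.25 (dark)
    return colorsys.hls_to_rgb(hue, lightness, 1.0)


def row_color(patch_index: int, n_rows: int, n_cols: int, hue: float = 0.6) -> RGB:
    """Color a patch by its row only (hypothetical blue-ish ramp)."""
    row, _ = row_col(patch_index, n_cols)
    return sequential_color(row, n_rows, hue)


def col_color(patch_index: int, n_rows: int, n_cols: int, hue: float = 0.05) -> RGB:
    """Color a patch by its column only (hypothetical orange-ish ramp)."""
    _, col = row_col(patch_index, n_cols)
    return sequential_color(col, n_cols, hue)
```

The visualization would then offer two views of the same tokens: under `row_color`, all patches in one row share a color, and under `col_color`, all patches in one column do. Each 1D ramp is easier to read than a 2D blend, at the cost of needing both views to recover a patch's full position.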