google-research / big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Apache License 2.0

questions about t-SNE visualization in FlexiViT #43

Closed by ShipengFu 9 months ago

ShipengFu commented 11 months ago

Hi, FlexiViT is a very inspirational idea.

However, I'm kind of stuck at the t-SNE visualization in Fig. 6 of the paper.

Does t-SNE use the arccosine-transformed CKA as the precomputed metric?

If so, how do we calculate the CKA similarity? Is it between a FlexiViT at different patch sizes and a standard ViT at a fixed patch size?

lucasb-eyer commented 11 months ago

@simonster implemented this. Personally, I don't remember the details for sure, but IIRC it only involves FlexiViT, and it shows CKA between the layers of FlexiViT executed at different patch sizes.
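
For anyone trying to reproduce something similar, here is a minimal sketch of the pipeline as described in this thread. It is not the authors' implementation. It assumes linear CKA (the paper may use a different or minibatch variant) and assumes that the arccosine of CKA is used as the precomputed distance, which the thread does not confirm. The `reps` list is hypothetical random stand-in data; in practice each entry would hold the activations of one (layer, patch size) combination on the same batch of images.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between activation matrices of shape (n_examples, n_features).

    Both matrices must cover the same examples in the same order; the feature
    dimensions may differ.
    """
    # Center each feature over the examples.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)


def arccos_cka_distances(reps):
    """Pairwise arccos(CKA) matrix for a list of activation matrices."""
    n = len(reps)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Clip to guard against values slightly outside [0, 1] from rounding.
            cka = np.clip(linear_cka(reps[i], reps[j]), 0.0, 1.0)
            dist[i, j] = dist[j, i] = np.arccos(cka)
    return dist


# Hypothetical stand-in data: 6 "layers", 64 examples, 16 features each.
rng = np.random.default_rng(0)
reps = [rng.normal(size=(64, 16)) for _ in range(6)]
dist = arccos_cka_distances(reps)
```

The resulting `dist` matrix could then be passed to a t-SNE implementation that accepts precomputed distances, for example scikit-learn's `TSNE(metric="precomputed", init="random")` with a `perplexity` smaller than the number of points.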