xxxnell / how-do-vits-work

(ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"
https://arxiv.org/abs/2202.06709
Apache License 2.0
806 stars 79 forks

Frequency Analysis for MoCo-v3 #25

Closed (YuanLiuuuuuu closed 2 years ago)

YuanLiuuuuuu commented 2 years ago

Hi, thank you for your great work. We used your code to analyze the features of MoCo-v3 from a frequency perspective, but we obtained the following trend (plot attached as an image in the original issue). I am a little confused, because I expected a decreasing trend.

xxxnell commented 2 years ago

Hi @YuanLiuuuuuu, thank you for reporting this great result. Intuitively, the high-frequency amplitude of MoCo-v3 may be lower than that of other ViTs, and some layers of MoCo-v3 may reduce the high-frequency amplitude. Since I am also interested in the behaviours of self-supervised learning methods, could you please send me an e-mail (namuk.park@gmail.com) or leave your e-mail address here, if you don't mind? Then I'll try to follow up on this issue within a few weeks.
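
For readers who want to see what "high-frequency amplitude" of a feature map could mean in practice, here is a minimal sketch. It is not the repository's code: it uses NumPy instead of PyTorch, and the function name `relative_log_amplitude`, the `eps` value, and the choice to read the spectrum along the diagonal from the zero frequency outward are all assumptions made for illustration. It also assumes a square feature map of shape (channels, height, width); ViT token sequences would first have to be reshaped to such a grid, which is not shown.

```python
import numpy as np


def relative_log_amplitude(feature_map, eps=1e-6):
    """Log amplitude along the spectrum diagonal, relative to the zero frequency.

    feature_map: array of shape (C, H, W) with H == W (assumption of this sketch).
    Returns a 1D array whose first entry is 0 (zero frequency) and whose last
    entry corresponds to the highest frequency on the diagonal.
    """
    feature_map = np.asarray(feature_map, dtype=np.float64)
    _, h, w = feature_map.shape
    if h != w:
        raise ValueError("this sketch assumes a square feature map")

    # 2D Fourier transform over the spatial axes, zero frequency moved to the center.
    spectrum = np.fft.fft2(feature_map, axes=(-2, -1))
    spectrum = np.fft.fftshift(spectrum, axes=(-2, -1))

    # Log amplitude, averaged over channels; eps avoids log(0).
    log_amp = np.log(np.abs(spectrum) + eps).mean(axis=0)

    # Diagonal from the center (zero frequency) toward the corner.
    diag = np.diagonal(log_amp)[h // 2:]
    return diag - diag[0]


# Synthetic example (random data, not model features): a 2x2 circular
# averaging filter acts as a low-pass filter, so it should lower the
# high-frequency end of the curve.
rng = np.random.default_rng(0)
raw = rng.standard_normal((4, 8, 8))
blurred = (
    raw
    + np.roll(raw, 1, axis=-1)
    + np.roll(raw, 1, axis=-2)
    + np.roll(raw, (1, 1), axis=(-2, -1))
) / 4

raw_curve = relative_log_amplitude(raw)
blurred_curve = relative_log_amplitude(blurred)
```

Under these assumptions, applying the function to the output of each layer and plotting the last entry against depth would give a per-layer trend of the kind discussed above. A layer that behaves like a low-pass filter lowers that value, as the blurred example does.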