ShirAmir / dino-vit-features

Official implementation for the paper "Deep ViT Features as Dense Visual Descriptors".
https://dino-vit-features.github.io
MIT License

DINO vs MAE #10

Open fabienbaradel opened 1 year ago

fabienbaradel commented 1 year ago

Hi, thanks for your amazing work. The study is very interesting. You use DINO as the feature extractor, and I was wondering whether you tried MAE or a different method. If so, did you get the same or similar results? Thanks for your time.

ShirAmir commented 1 year ago

Hi!

Thanks for your interest in our work! We ran PCA visualizations similar to those in the paper using the ViT backbone of CLIP, and got results similar to those of the supervised ViT: relatively noisy maps, and in the last layer the first principal components tend to distinguish between classes. We did not apply these visualizations to MAE. Feel free to try it out and submit a pull request 🙂
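For anyone who wants to try this on MAE or another backbone, here is a minimal sketch of the PCA step only. It is not the repository's code: `pca_maps` is a hypothetical helper, and the random array stands in for real per-patch descriptors, which you would extract from your own model. The 14 x 14 grid and 384 dimensions are placeholder values as well.

```python
import numpy as np


def pca_maps(descriptors, grid_hw, n_components=3):
    """Project per-patch descriptors onto their top principal components.

    descriptors: array of shape (num_patches, dim), one row per patch token.
    grid_hw: (rows, cols) of the patch grid, with rows * cols == num_patches.
    Returns an array of shape (rows, cols, n_components) scaled to [0, 1].
    """
    rows, cols = grid_hw
    x = np.asarray(descriptors, dtype=np.float64)
    if x.shape[0] != rows * cols:
        raise ValueError("grid_hw does not match the number of patches")

    # Center the descriptors so the SVD yields principal directions.
    x = x - x.mean(axis=0, keepdims=True)

    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    proj = x @ vt[:n_components].T

    # Rescale each component independently to [0, 1] for display.
    lo = proj.min(axis=0, keepdims=True)
    hi = proj.max(axis=0, keepdims=True)
    proj = (proj - lo) / np.maximum(hi - lo, 1e-12)

    return proj.reshape(rows, cols, n_components)


# Placeholder input: random values instead of real ViT patch descriptors.
rng = np.random.default_rng(0)
fake_descriptors = rng.normal(size=(14 * 14, 384))

maps = pca_maps(fake_descriptors, (14, 14))
print(maps.shape)  # (14, 14, 3)
```

Each of the three channels can be shown as a grayscale map, or the three together as an RGB image. To compare backbones, swap `fake_descriptors` for the patch tokens of the layer you care about and keep the rest unchanged.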