ShirAmir / dino-vit-features

Official implementation for the paper "Deep ViT Features as Dense Visual Descriptors".
https://dino-vit-features.github.io
MIT License

The choice of head_idx #1

Closed kwea123 closed 2 years ago

kwea123 commented 2 years ago

https://github.com/ShirAmir/dino-vit-features/blob/79c1289b5a83960b85ca8e268bc569f48975fddb/extractor.py#L313

Is there any reason why you chose these heads?

ShirAmir commented 2 years ago

We found that heads 1 and 3 attend to background information in ViT-S, so we decided to disregard them for saliency map creation. The method still works comparably well when using all heads, at the cost of adjusting some parameters.
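
To illustrate the head selection described above, here is a minimal sketch, not the repository's code. It assumes the head indices are zero-based, that ViT-S has 6 attention heads (so the kept heads would be 0, 2, 4 and 5), and that the saliency map is the CLS token's attention to the patch tokens, averaged over heads and min-max normalized. The function name `cls_saliency` and the random attention tensor are hypothetical and stand in for a real model's output.

```python
import numpy as np

def cls_saliency(attn, head_idxs=None):
    """Hypothetical helper: average CLS-to-patch attention over chosen heads.

    attn: array of shape (batch, heads, tokens, tokens), softmax-normalized,
          with token 0 assumed to be the CLS token.
    head_idxs: heads to keep; None means use all heads.
    """
    if head_idxs is None:
        head_idxs = list(range(attn.shape[1]))
    # CLS row of every head, dropping the CLS-to-CLS entry: (batch, heads, tokens-1)
    cls_attn = attn[:, :, 0, 1:]
    # Keep only the selected heads, then average over them: (batch, tokens-1)
    sal = cls_attn[:, head_idxs].mean(axis=1)
    # Min-max normalize each image's map to [0, 1] (an illustrative choice)
    mins = sal.min(axis=1, keepdims=True)
    maxs = sal.max(axis=1, keepdims=True)
    return (sal - mins) / (maxs - mins + 1e-12)

# Stand-in attention: 2 images, 6 heads, 1 CLS token + 4 patch tokens
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 6, 5, 5))
attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

kept_heads = cls_saliency(attn, head_idxs=[0, 2, 4, 5])  # skip heads 1 and 3
all_heads = cls_saliency(attn)                           # use every head
```

Comparing `kept_heads` with `all_heads` on real attention maps is one way to see how much the excluded heads change the result, and whether any downstream threshold would need retuning when all heads are used.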