haoqiwang / sinder

SINDER: Repairing the Singular Defects of DINOv2 (ECCV 2024 Oral)

Pre-training Weights of ViT-Small/Base/Large #2

Open guojiajeremy opened 2 months ago

guojiajeremy commented 2 months ago

Thank you for your GREAT work.

The default ViT size in the paper and released weights is ViT-g, which is too large for researchers and users with limited resources. Could you please release the weights of smaller versions of ViT, such as ViT-Small/Base/Large?

hiyyg commented 1 month ago

+1

hiyyg commented 1 month ago

@haoqiwang

wangyi111 commented 1 month ago

Just came across this paper, looks very interesting. From what DINOv2 reports, the outlier tokens only show up with a sufficiently big model after sufficiently long training. I guess it wouldn't make much sense with ViT-S/B versions of DINOv2, but ViT-L or ViT-B from supervised or openCLIP could be interesting to see. Also, DINOv2 officially seems to have only released ViT-G trained from scratch, their ViT-L/B/S are distilled from ViT-G. I don't know if there's still outliers in those distilled models.