facebookresearch / suncet

Code to reproduce the results in the FAIR research papers "Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples" https://arxiv.org/abs/2104.13963 and "Supervision Accelerates Pre-training in Contrastive Semi-Supervised Learning of Visual Representations" https://arxiv.org/abs/2006.10803
MIT License

using ViT backbone with PAWS #26

Open · islam-nassar opened 2 years ago

islam-nassar commented 2 years ago

Hi Mido,

Thanks for the excellent work and for sharing the code. I was curious whether you have tried PAWS with a ViT backbone. I ask because your concurrent work (DINO) and others use ViT, so I was hoping you had already tried it. If not, do you reckon it would be straightforward to do by adjusting the model in your code, or do you foresee bigger complications?

Cheers

MidoAssran commented 2 years ago

@islam-nassar

Yes, we tried with a ViT backbone! In short, it worked out of the box with a setup similar to DINO's (including an increasing weight-decay schedule*).
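As a rough illustration of the backbone swap, a sketch of what it could look like is below, assuming the `timm` library is available. The model name, projection-head shape, and embedding dimension are illustrative assumptions, not the exact configuration used:

```python
# Minimal sketch (not the repo's actual code): replace PAWS's ResNet
# encoder with a ViT from timm, plus a PAWS-style projection head.
import timm
import torch
import torch.nn as nn

class ViTPAWSEncoder(nn.Module):
    def __init__(self, arch='vit_small_patch16_224', proj_dim=256):
        super().__init__()
        # num_classes=0 makes timm return pooled features instead of logits
        self.backbone = timm.create_model(arch, pretrained=False, num_classes=0)
        feat_dim = self.backbone.num_features
        # projection head mapping backbone features into the embedding
        # space where the soft nearest-neighbour loss is computed
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, feat_dim),
            nn.GELU(),
            nn.Linear(feat_dim, proj_dim),
        )

    def forward(self, x):
        return self.projector(self.backbone(x))

encoder = ViTPAWSEncoder()
z = encoder(torch.randn(2, 3, 224, 224))  # -> shape (2, 256)
```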

Evaluation: soft nearest-neighbour (NN) classification with 10% of the labels (no fine-tuning).
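For reference, soft NN classification here means labelling each test embedding by a similarity-weighted vote over a labelled support set, as in the PAWS paper. A minimal sketch (variable names and the temperature value are illustrative):

```python
# Soft nearest-neighbour classification: softmax over cosine similarities
# to the labelled support set, then a weighted average of one-hot labels.
import torch
import torch.nn.functional as F

def soft_nn_classify(z_test, z_support, y_support, tau=0.1):
    """
    z_test:    (n, d) test embeddings
    z_support: (m, d) embeddings of labelled support samples
    y_support: (m, c) one-hot labels of the support samples
    """
    z_test = F.normalize(z_test, dim=1)
    z_support = F.normalize(z_support, dim=1)
    # cosine similarities -> softmax weights over the support set
    weights = F.softmax(z_test @ z_support.T / tau, dim=1)  # (n, m)
    probs = weights @ y_support                             # (n, c)
    return probs.argmax(dim=1)
```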

*Although you can probably just use a constant weight-decay value; I'm not sure the increasing schedule was that important in this experiment.
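For concreteness, an increasing cosine weight-decay schedule could be implemented as below. The 0.04 → 0.4 endpoints are DINO's defaults and are an assumption here, not values confirmed in this thread:

```python
# Cosine interpolation of weight decay from wd_start up to wd_end
# over the course of training (increasing schedule, as in DINO).
import math

def cosine_wd(step, total_steps, wd_start=0.04, wd_end=0.4):
    progress = step / max(1, total_steps)
    return wd_end + 0.5 * (wd_start - wd_end) * (1 + math.cos(math.pi * progress))

# applied each step to the optimizer's decayed parameter groups, e.g.:
# for group in optimizer.param_groups:
#     if group['weight_decay'] != 0:
#         group['weight_decay'] = cosine_wd(step, total_steps)
```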

Let me know if there's some other information about the setup you need that I forgot to mention!