facebookresearch / dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.
Apache License 2.0

fine tune #162

Status: Open · wendy0527 opened this issue 1 year ago

wendy0527 commented 1 year ago

My purpose is image retrieval, and I want to fine-tune on my own dataset. Should I attach a head module and fine-tune some layers of the pretrained backbone together with the head? Or should I load the pretrained backbone and continue training it for several epochs?

MohammedSB commented 1 year ago

Generally, this depends on the size of your dataset. If you have a huge dataset you could fine-tune the entire model (or at least a large portion of it), but if you have a small to medium-sized dataset, you should probably just fine-tune the added head (see the sketch below).
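For example, here is a minimal sketch of the frozen-backbone + trainable-head setup (the 256-d output size, optimizer settings, and retrieval loss are placeholders, and it assumes the torch.hub entry points from this repo):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Load a pretrained DINOv2 backbone (ViT-S/14 here) from torch hub
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")

# Freeze the backbone so only the added head receives gradients
for param in backbone.parameters():
    param.requires_grad = False
backbone.eval()

# Hypothetical retrieval head: project the CLS embedding to a 256-d descriptor
head = nn.Linear(backbone.embed_dim, 256)  # embed_dim is 384 for ViT-S/14
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)

# Forward pass on a dummy batch (image sides must be multiples of the patch size, 14)
images = torch.randn(4, 3, 224, 224)
with torch.no_grad():
    features = backbone(images)  # (4, embed_dim) CLS features
descriptors = F.normalize(head(features), dim=-1)
# ...compute your retrieval loss (e.g. a contrastive or triplet loss) on `descriptors`,
# then loss.backward() and optimizer.step() as usual.
```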

wendy0527 commented 1 year ago

ok, thank you so much~

qasfb commented 1 year ago

The pretrained backbones obtained with distillation have not been trained on the masked-image-modeling task, so my expectation is that they would not be particularly useful as initializations for further DINOv2 training.

If your compute budget allows it, I would recommend training a smaller model, such as ViT-L, from scratch on your dataset using the provided config file.
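For reference, the launch command documented in this repo's README looks like the following (the node count, config file, and dataset path format all depend on your setup):

```shell
python dinov2/run/train/train.py \
    --nodes 4 \
    --config-file dinov2/configs/train/vitl16_short.yaml \
    --output-dir <PATH/TO/OUTPUT/DIR> \
    train.dataset_path=ImageNet:split=TRAIN:root=<PATH/TO/DATASET>:extra=<PATH/TO/DATASET>
```

You would point `train.dataset_path` at your own data, which will likely require adapting the dataset loading code for a custom dataset.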