wendy0527 opened this issue 1 year ago
Generally, this depends on the size of your dataset. If you have a huge dataset you could fine-tune the entire model (or at least a large portion of it), but if you have a small-to-medium-sized dataset, you should probably fine-tune only the added head.
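To make the head-only option concrete, here is a minimal PyTorch sketch of the freezing pattern: all backbone parameters are frozen and only an added projection head is trained. The backbone below is a hypothetical stand-in module (and `embed_dim`/`head_dim` are made-up sizes); in practice you would load the pretrained DINOv2 ViT backbone in its place.

```python
# Sketch: freeze a pretrained backbone, fine-tune only an added retrieval head.
# NOTE: `backbone` is a stand-in nn.Module for illustration, not the real
# DINOv2 ViT; swap in the pretrained backbone you loaded.
import torch
import torch.nn as nn

embed_dim, head_dim = 384, 128  # hypothetical feature / embedding sizes

backbone = nn.Sequential(nn.Linear(8, embed_dim), nn.GELU())  # stand-in backbone
head = nn.Linear(embed_dim, head_dim)  # added retrieval/projection head

# Freeze every backbone parameter so only the head receives gradients.
for p in backbone.parameters():
    p.requires_grad = False
backbone.eval()  # keep any norm/dropout layers in inference mode

# The optimizer sees only the trainable head parameters.
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)

x = torch.randn(4, 8)                 # dummy batch
with torch.no_grad():
    feats = backbone(x)               # frozen features
emb = nn.functional.normalize(head(feats), dim=-1)  # L2-normalized embeddings
```

The L2 normalization at the end is a common choice for retrieval, since cosine similarity between embeddings then reduces to a dot product.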
ok, thank you so much~
The pretrained backbones obtained with distillation have not been trained on the masked-image-modeling task, so my expectation is that they would not be particularly useful as initializations for further DINOv2 training...
If your compute allows it, I would recommend training a smaller ViT-L model from scratch on your dataset, using the provided config file.
My purpose is image retrieval, and I want to fine-tune on my own dataset. Should I attach a head module and fine-tune the head together with some layers of the pretrained backbone? Or should I load the pretrained backbone and continue training it for several epochs?