facebookresearch / dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.

Self-supervised fine-tuning #92

Open pablotalavante opened 1 year ago

pablotalavante commented 1 year ago

I have compared the instance retrieval performance of my current model against DINOv2: my model, trained on my own data, performs better, although DINOv2 does quite well. However, it looks like the way DINOv2 is trained is more stable than my current method.

Do you think it is worthwhile to fine-tune the trained weights of DINOv2 on my own data? My guess is that with all that pretraining, it should be fairly easy for the model to switch domains, but I have no idea how much time this might take or which hyperparameters to use. Any take on this? Another idea would be to fine-tune on a downstream task, but I do not have a large amount of labeled data, and I think the model would benefit more from additional unlabeled data.
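For the low-label route, something like this frozen-backbone linear probe is what I have in mind (a minimal sketch: the `dinov2_vits14` torch.hub entry point is the published one, while `num_classes` and the single random batch are placeholders for a real labeled DataLoader):

```python
import torch
import torch.nn as nn

# Published DINOv2 ViT-S/14 backbone from torch.hub.
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False  # keep the pretrained features intact

num_classes = 5  # placeholder for the actual label set
head = nn.Linear(backbone.embed_dim, num_classes)  # embed_dim is 384 for ViT-S/14
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Dummy batch so the sketch runs; substitute a real labeled DataLoader.
# Image sides must be multiples of the 14-pixel patch size (224 = 14 * 16).
labeled_loader = [(torch.randn(8, 3, 224, 224), torch.randint(0, num_classes, (8,)))]

for images, labels in labeled_loader:
    with torch.no_grad():
        feats = backbone(images)  # CLS-token features, shape (B, embed_dim)
    loss = criterion(head(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Only the head trains here, so even a small labeled set might be enough; unfreezing the backbone would be the heavier alternative.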

I have checked the code, and I guess I should load the pre-trained model and replace the data loader with my own in https://github.com/facebookresearch/dinov2/blob/c3c2683a13cde94d4d99f523cf4170384b00c34c/dinov2/train/train.py#L195
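For warm-starting that training pipeline from the released checkpoint, I imagine something like the following (the torch.hub call is the documented way to fetch the weights; the attribute path into the training-time model is a guess that would need checking against the code in dinov2/train/):

```python
import torch

# Fetch the published ViT-B/14 backbone weights via the documented hub entry point.
pretrained = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")
state_dict = pretrained.state_dict()

# Inside train.py, after the training model is built, something along these
# lines (the attribute names below are assumptions, not verified API):
#   model.student.backbone.load_state_dict(state_dict, strict=False)
#   model.teacher.backbone.load_state_dict(state_dict, strict=False)
# strict=False tolerates parameters that exist only at training time
# (projection heads, etc.) or only in the inference backbone.
```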

Any comments on this issue are welcome!

patricklabatut commented 1 year ago

Do you think it is worthwhile to fine-tune the trained weights of DINOv2 on my own data?

I believe we never tried fine-tuning for specific retrieval tasks. We generally expect DINOv2 features to perform well out of the box and not to require fine-tuning for solid performance. Of course, with a suitable dataset, fine-tuning could still help increase performance further.

JeavanCode commented 1 year ago

Do you think it is worthwhile to fine-tune the trained weights of DINOv2 on my own data?

I believe we never tried fine-tuning for specific retrieval tasks. We generally expect DINOv2 features to perform well out of the box and not to require fine-tuning for solid performance. Of course, with a suitable dataset, fine-tuning could still help increase performance further.

I tried fine-tuning the DINOv1 base model on ImageNet-2012, achieving 82.3% top-1 validation accuracy (4% ahead of the linear probe), and so did the BEiT v2 authors (they achieved 0.5% more than me). However, when I fine-tune DINOv2 with the same recipe, I only reach 84.8% accuracy, which is only a 0.3% lead. Could it be that DINOv2 already extracts features so good that they hit the upper bound of ViT-Base, or is my implementation simply wrong? LOL
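Schematically, the kind of full fine-tuning loop I mean (the hyperparameters below are simplified placeholders, not the exact recipe):

```python
import torch
import torch.nn as nn

# Published DINOv2 ViT-B/14 backbone plus a fresh 1000-way ImageNet head.
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")
model = nn.Sequential(backbone, nn.Linear(backbone.embed_dim, 1000))

# Small learning rate for the pretrained weights, larger for the new head.
optimizer = torch.optim.AdamW(
    [
        {"params": model[0].parameters(), "lr": 1e-5},
        {"params": model[1].parameters(), "lr": 1e-3},
    ],
    weight_decay=0.05,
)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

# Dummy ImageNet-style batch so the sketch runs; substitute a real DataLoader.
train_loader = [(torch.randn(8, 3, 224, 224), torch.randint(0, 1000, (8,)))]

model.train()
for images, labels in train_loader:
    loss = criterion(model(images), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```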

BenSpex commented 1 year ago

Can you provide example code for the fine-tuning, @slowpokeJeavan? Our goal is to fine-tune the model on data from plants, allowing it to better learn different species, leaves, stems, etc. The current model is a really great general-purpose model, but it is not very domain-specific, which of course it never aimed to be.

programmeddeath1 commented 1 year ago

Upvoting and commenting for any sample reference code for fine-tuning on specific domains.

innat-asj commented 1 year ago

From here

... these visual features are robust and perform well across domains without any requirement for fine-tuning.

I find this argument a bit odd. How does it hold for task-specific or domain-specific problems, which are the real cases where fine-tuning is needed?

benblack769 commented 1 month ago

Basically everyone in medicine fine-tunes DINOv2 on some custom dataset, because none of these foundation models work out of the box on medical data; everything needs significant fine-tuning. This repo is an example I am trying to base my efforts on: https://github.com/beneroth13/dinov2