facebookresearch / dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.
Apache License 2.0
8.74k stars 751 forks

Loading state dict to finetune model #154

Open surajyakoa opened 1 year ago

surajyakoa commented 1 year ago

Hello!

I'm trying to finetune the vit_b backbone model on my own dataset. I've created my own dataloader and am able to run the training script, and I see the total loss decreasing slowly. However, I'm not totally convinced that the pretrained weights are being loaded correctly. For one, I see the starting 'Total Loss' at ~15.6 regardless of whether I load the weights.

pretrained_weights = "./pretrained_models/dinov2_vitb14_pretrain.pth"
state_dict = torch.load(pretrained_weights, map_location="cuda")
state_dict = {k.replace("module.", ""): v for k, v in state_dict.items()}
state_dict = {k.replace("backbone.", ""): v for k, v in state_dict.items()}
msg = model.student.backbone.load_state_dict(state_dict, strict=False)

I'm using some code like this to load the weights. Any thoughts on whether I am doing this correctly would be greatly appreciated!
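One way to sanity-check loading like this, sketched below on a toy module rather than the real DINOv2 backbone: load_state_dict(strict=False) returns an object whose missing_keys and unexpected_keys lists show exactly which checkpoint tensors failed to match, so printing them tells you whether the prefix stripping worked.

```python
import torch
from torch import nn

# Toy stand-in for the backbone; the real one is a ViT, but the
# key-matching mechanics are the same.
class TinyBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)

model = TinyBackbone()

# Simulate a checkpoint whose keys carry a "backbone." prefix,
# as in the snippet above.
ckpt = {
    "backbone.proj.weight": torch.zeros(4, 4),
    "backbone.proj.bias": torch.zeros(4),
}
state_dict = {k.replace("module.", "").replace("backbone.", ""): v
              for k, v in ckpt.items()}

msg = model.load_state_dict(state_dict, strict=False)
# Both lists empty -> every checkpoint tensor found a matching parameter.
print("missing:", msg.missing_keys)        # []
print("unexpected:", msg.unexpected_keys)  # []
```

If either list is non-empty after loading the real checkpoint, the weights are not actually reaching the backbone, which would explain an unchanged starting loss.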

qasfb commented 1 year ago

Are you trying to perform a new dinov2 training using the distilled model as initialization? I expect it will not work well, because the distilled models are not trained for masked image modeling at all.

surajyakoa commented 1 year ago

Hmm, I haven't tried torch.hub.load(). How does this differ from the method I'm using now?

Ah really, the masked-image-modeling training does not transfer during the distillation? Do you have a proposed method of finetuning a model on a custom dataset that might be successful?

qasfb commented 1 year ago

Actually, for training, torch hub load will not be enough; to resume training you would need the prototype heads of a non-distilled model, and we do not provide those at the moment.

If you have labels in your dataset, you can use the model as initialization for your backbone and perform training (but not with dinov2). What are you trying to do that the pretrained backbones do not allow, by the way?
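The labeled-data route described above can be sketched as standard supervised finetuning: initialize the backbone from the pretrained weights, attach a task head, and train with a supervised loss. This is a hypothetical minimal sketch with a toy linear stand-in for the backbone (in practice you would load dinov2_vitb14 via torch.hub) and an assumed 10-class task.

```python
import torch
from torch import nn

torch.manual_seed(0)

backbone = nn.Linear(16, 768)  # stand-in for the pretrained ViT-B/14 backbone
head = nn.Linear(768, 10)      # task head, assuming 10 classes
opt = torch.optim.AdamW(
    list(backbone.parameters()) + list(head.parameters()), lr=1e-4
)

# One supervised step on a dummy batch.
x = torch.randn(8, 16)
y = torch.randint(0, 10, (8,))
opt.zero_grad()
logits = head(backbone(x))
loss = nn.functional.cross_entropy(logits, y)
loss.backward()
opt.step()
print("loss:", loss.item())
```

A common variant is to freeze the backbone (requires_grad_(False)) and train only the head, i.e. a linear probe, which is cheaper and avoids degrading the pretrained features.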

surajyakoa commented 1 year ago

Thanks for the responses! The dataset does not have labels. I am aiming to finetune the backbone such that the self-supervised features fit the distribution of my data a bit better (although it already does quite well out of the box). I want to use the embeddings as part of a vector/image similarity piece. Is this sort of finetuning possible?
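The vector/image-similarity piece mentioned above usually reduces to comparing L2-normalized embeddings by cosine similarity. A minimal sketch, with random tensors standing in for DINOv2 CLS embeddings (768-dimensional for ViT-B/14):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
query = torch.randn(1, 768)    # embedding of the query image (stand-in)
gallery = torch.randn(5, 768)  # embeddings of a small index (stand-ins)

# L2-normalize so that the dot product equals cosine similarity.
q = F.normalize(query, dim=-1)
g = F.normalize(gallery, dim=-1)
scores = q @ g.T               # shape (1, 5), values in [-1, 1]
best = scores.argmax(dim=-1)
print("nearest index:", best.item())
```

Since the pretrained features already work well out of the box for this, it may be worth benchmarking this baseline before attempting any finetuning.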

qasfb commented 1 year ago

We haven't explored this kind of finetuning yet, so your guess is as good as mine, I would say...

MohammedSB commented 1 year ago

My goal is to do that too, and my guess is that it would perform at least better than a from-scratch encoder, so I would suggest you try it.

GZ-YourZY commented 10 months ago

Hello!

I'm trying to finetune the vit_b backbone model on my own dataset. I've created my own dataloader and am able to run the training script, and I see the total loss decreasing slowly. However, I'm not totally convinced that the pretrained weights are being loaded correctly. For one, I see the starting 'Total Loss' at ~15.6 regardless of whether I load the weights.

pretrained_weights = "./pretrained_models/dinov2_vitb14_pretrain.pth"
state_dict = torch.load(pretrained_weights, map_location="cuda")
state_dict = {k.replace("module.", ""): v for k, v in state_dict.items()}
state_dict = {k.replace("backbone.", ""): v for k, v in state_dict.items()}
msg = model.student.backbone.load_state_dict(state_dict, strict=False)

I'm using some code like this to load the weights. Any thoughts on whether I am doing this correctly would be greatly appreciated!

Regarding the task of fine-tuning this model, may I ask whether you have made any progress recently, and whether you could share how you went about fine-tuning dinov2 on your own data?

PMRS-lab commented 1 week ago

Thanks for the responses! The dataset does not have labels. I am aiming to finetune the backbone such that the self-supervised features fit the distribution of my data a bit better (although it already does quite well out of the box). I want to use the embeddings as part of a vector/image similarity piece. Is this sort of finetuning possible?

Have you solved the issue you mentioned? I now have the same problem: fine-tuning the backbone network. Thank you very much.

PMRS-lab commented 1 week ago

Hello! I'm trying to finetune the vit_b backbone model on my own dataset. I've created my own dataloader and am able to run the training script, and I see the total loss decreasing slowly. However, I'm not totally convinced that the pretrained weights are being loaded correctly. For one, I see the starting 'Total Loss' at ~15.6 regardless of whether I load the weights.

pretrained_weights = "./pretrained_models/dinov2_vitb14_pretrain.pth"
state_dict = torch.load(pretrained_weights, map_location="cuda")
state_dict = {k.replace("module.", ""): v for k, v in state_dict.items()}
state_dict = {k.replace("backbone.", ""): v for k, v in state_dict.items()}
msg = model.student.backbone.load_state_dict(state_dict, strict=False)

I'm using some code like this to load the weights. Any thoughts on whether I am doing this correctly would be greatly appreciated!

Regarding the task of fine-tuning this model, may I ask whether you have made any progress recently, and whether you could share how you went about fine-tuning dinov2 on your own data?

Have you solved the issue you mentioned? I now have the same problem: fine-tuning the backbone network. Thank you very much. Written in Beijing, China.