keyu-tian / SparK

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
https://arxiv.org/abs/2301.03580
MIT License

how to finetune resnet in my own dataset? #35

Closed merlinxu-zz closed 11 months ago

merlinxu-zz commented 1 year ago

Hi! I found that I can pretrain ResNet on my own dataset by replacing the function build_dataset_to_pretrain and running /pretrain/main.py. I can also finetune ResNet on ImageNet by running /downstream_imagenet/main.py. But how can I finetune ResNet on my own dataset? When I run /pretrain/main.py with --resume_from=/mypath-to-res50_withdecoder_1kpretrained_spark_style.pth, I get an error like this:

  File "/workspace/user_code/SparK/pretrain/utils/misc.py", line 180, in load_checkpoint
    missing, unexpected = model_without_ddp.load_state_dict(checkpoint['module'], strict=False)
  KeyError: 'module'

keyu-tian commented 1 year ago

We use two different, independent codebases for pretraining and ImageNet finetuning. They are in /pretrain and /downstream_imagenet, respectively.

  1. To finetune ResNet on your own dataset, you can do something similar in the /downstream_imagenet codebase to what is done in the /pretrain codebase, i.e. replace the dataset with your own.

  2. If you want to finetune from res50_withdecoder_1kpretrained_spark_style.pth, you may need to run /downstream_imagenet/main.py instead of our /pretrain/main.py. See /downstream_imagenet/README.md for details.

merlinxu-zz commented 1 year ago

Thank you so much! I have another question: when I run /downstream_imagenet/main.py on my own dataset, I see that it computes accuracy by checking whether the class predicted by the model matches the target class. But what if my dataset is not ImageNet and I don't have category labels? Should I treat each picture as its own category? (By the way, since the code uses category labels, can it still be called self-supervised learning?)

keyu-tian commented 1 year ago

@merlinxu-zz Can you explain your task and dataset to me in more detail?

merlinxu-zz commented 1 year ago

Sure! My dataset is 400,000 (40w) pictures of game scenes, but only 20,000 (2w) of them have labels for the object detection task. So I want to use the remaining 380,000 (38w) pictures for self-supervised learning with ResNet-50, because I use ResNet-50 as the backbone in my object detection task. How should I use this unlabeled data in SparK? Should I put one picture in one folder to make my dataset the same structure as ImageNet?

keyu-tian commented 1 year ago

OK, I see. So you want to continue self-supervised finetuning on your own dataset (38w, i.e. 380,000 images).

First, I need to explain that our /downstream_imagenet codebase is for supervised finetuning. If you want self-supervised finetuning starting from our pretrained checkpoint, use our /pretrain codebase. Specifically, see the last line of https://github.com/keyu-tian/SparK/tree/main/pretrain#tutorial-for-pretraining-your-own-dataset, and use --init_weight=/path/to/res50_withdecoder_1kpretrained_spark_style.pth to specify our ResNet checkpoint.

As for the dataset, you need to define a new Python class for your dataset, to replace our ImageNetDataset. So "put one picture in one folder to make my dataset the same structure as ImageNet" is not required. Just define your own class, where you can directly load data without labels.
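
A label-free dataset class of the kind described above could look roughly like the sketch below. This is only an illustration, not the code from this repository. The class name UnlabeledImageDataset, the pil_loader helper and the extension list are hypothetical. It assumes that the pretraining loop only needs an image per sample (no target), and that a plain map-style object with __len__ and __getitem__ is acceptable to torch.utils.data.DataLoader. Pillow is imported lazily, so listing the files does not require it.

```python
import os
from typing import Any, Callable, List, Optional

# Hypothetical list of accepted file extensions; adjust to your data.
IMG_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp', '.webp')


def pil_loader(path: str) -> Any:
    """Default loader: open an image file as an RGB PIL image."""
    from PIL import Image  # imported lazily; only needed when an image is read
    with open(path, 'rb') as f:
        return Image.open(f).convert('RGB')


class UnlabeledImageDataset:
    """Map-style dataset over every image found under `root`, with no labels.

    Images may sit in any folder layout; sub-folders are NOT treated as classes.
    """

    def __init__(
        self,
        root: str,
        transform: Optional[Callable[[Any], Any]] = None,
        loader: Callable[[str], Any] = pil_loader,
    ):
        self.transform = transform
        self.loader = loader
        # Walk the tree once and keep a sorted list of image paths.
        self.paths: List[str] = sorted(
            os.path.join(dirpath, name)
            for dirpath, _, names in os.walk(root)
            for name in names
            if name.lower().endswith(IMG_EXTENSIONS)
        )
        if not self.paths:
            raise FileNotFoundError(f'no images found under {root!r}')

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, index: int) -> Any:
        # Return only the (transformed) image; there is no target.
        img = self.loader(self.paths[index])
        return img if self.transform is None else self.transform(img)
```

An instance of such a class would then be returned from your replacement for build_dataset_to_pretrain, with whatever transform the pretraining code expects passed in as `transform`.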

You can refer to https://github.com/keyu-tian/SparK/issues/44#issuecomment-1563987249 for more implementation details.

merlinxu-zz commented 1 year ago

Thank you for your kind answer! It works now!