Is it possible to use unlabelled data for pretraining?

MIC-DKFZ / nnUNet

Apache License 2.0

Hi,

I have about 500 labelled images and 1500 unlabelled images. I wonder if it is possible to pretrain a model using those unlabelled images and then use the pretrained weights to initialize the model to train on the labelled dataset.

I followed the steps in https://github.com/Kent0n-Li/nnSAM/blob/main/documentation/pretraining_and_finetuning.md but got stuck at nnUNetv2_extract_fingerprint and nnUNetv2_move_plans_between_datasets, because both require labels (a "labels" field in the dataset.json file and labelled images in the labelsTr folder).
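To make the blocker concrete, here is a minimal sketch of the two label-related items described above that an unlabelled dataset folder lacks. It is not nnUNet's own validation code: the function name `missing_label_requirements` is hypothetical, and it assumes only that `dataset.json` and `labelsTr` sit directly in the dataset folder.

```python
import json
from pathlib import Path


def missing_label_requirements(dataset_dir):
    """Return the label-related items missing from a dataset folder.

    Hypothetical helper for illustration; not part of nnUNet.
    """
    root = Path(dataset_dir)
    problems = []

    # Requirement 1: dataset.json must contain a non-empty "labels" field.
    dataset_json = root / "dataset.json"
    if not dataset_json.is_file():
        problems.append("dataset.json not found")
    else:
        meta = json.loads(dataset_json.read_text())
        if not meta.get("labels"):
            problems.append('"labels" field missing or empty in dataset.json')

    # Requirement 2: the labelsTr folder must exist and contain files.
    labels_dir = root / "labelsTr"
    if not labels_dir.is_dir() or not any(labels_dir.iterdir()):
        problems.append("labelsTr folder missing or empty")

    return problems
```

Running this on my unlabelled dataset folder would report both items, which matches where the two commands stop.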

Is there a way to use the unlabelled data to improve training results on the labelled dataset, or does nnUNet only work with labelled datasets?

Thanks in advance for your response.

Is it possible to use unlabelled data for pretraining? #2338