LTH14 / mage

A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
MIT License

The specified path cannot be found #41

Open zhanglaoban-kk opened 1 year ago

zhanglaoban-kk commented 1 year ago

Hello, I want to load my own unlabeled image data for pre-training. I modified the --data_path argument in main_pretrain.py, parser.add_argument('--data_path', default='./data/imagenet', type=str, help='dataset path'), but the system reports that the specified path cannot be found. Why does this happen? Should the image data be split into training, validation, and test sets?

LTH14 commented 1 year ago

It should follow the same structure as the ImageNet data, with train/class_name/images.png and val/class_name/images.png.
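A minimal sketch for checking a data path against the layout described above before launching training. The function name check_imagenet_layout and the list of image extensions are illustrative assumptions, not part of the MAGE repository; the script only inspects directories and does not load any images.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}


def check_imagenet_layout(data_path, splits=("train", "val")):
    """Return a list of problems; an empty list means the layout looks fine."""
    root = Path(data_path)
    if not root.is_dir():
        return [f"data path does not exist: {root}"]

    problems = []
    for split in splits:
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing split directory: {split_dir}")
            continue
        # Each split is expected to contain one sub-directory per class.
        class_dirs = [d for d in split_dir.iterdir() if d.is_dir()]
        if not class_dirs:
            problems.append(f"no class sub-directories in: {split_dir}")
            continue
        for class_dir in class_dirs:
            has_image = any(
                f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
                for f in class_dir.iterdir()
            )
            if not has_image:
                problems.append(f"no images found in: {class_dir}")
    return problems


if __name__ == "__main__":
    for problem in check_imagenet_layout("./data/imagenet"):
        print(problem)
```

If the script prints a missing split directory, the path passed to --data_path probably points at the wrong level of the tree (for example at the train folder itself rather than its parent).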

zhanglaoban-kk commented 1 year ago

Hello, I want to use MAGE for self-supervised learning: pre-train on unlabeled image data, then load the pre-trained weights to classify labeled images. How should I do this?

LTH14 commented 12 months ago

My suggestion is to replace the default ImageNet dataloader with your own dataloader. Once that is done, you can use the unlabeled image data with main_pretrain.py and the labeled data with main_finetune.py.

zhanglaoban-kk commented 10 months ago

Hello, I have another question. You pre-trained on the labeled ImageNet dataset and then fine-tuned on it. Doesn't this cause data leakage, since labeled data is used during pre-training?

LTH14 commented 10 months ago

We only use the ImageNet images and never use the label information during pre-training.

zhanglaoban-kk commented 10 months ago

Does pre-training use only the images from the training and validation sets of ImageNet? What is the validation set used for?

LTH14 commented 10 months ago

It only uses the training set.

zhanglaoban-kk commented 10 months ago

As you said above, to pre-train on my own unlabeled image data I should replace the ImageNet dataloader. Where in the code should I make that change?

LTH14 commented 10 months ago

Change the dataset here to your own dataset implementation: https://github.com/LTH14/mage/blob/main/main_pretrain.py#L122
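One possible shape for such a customized dataset is sketched below. It is not taken from the MAGE repository: the names UnlabeledImageFolder and pil_loader are hypothetical, and the sketch assumes the training loop expects (image, label) pairs as an ImageFolder-style dataset would yield, so it returns a dummy label of 0 with every image. It is written as a plain map-style class, which torch.utils.data.DataLoader accepts because it only needs __len__ and __getitem__; subclassing torch.utils.data.Dataset is the more conventional choice and works the same way.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}


def pil_loader(path):
    """Open an image file as an RGB PIL image."""
    # Imported lazily so that listing files does not require Pillow.
    from PIL import Image
    with open(path, "rb") as f:
        return Image.open(f).convert("RGB")


class UnlabeledImageFolder:
    """Map-style dataset over every image under `root`; no class folders needed."""

    def __init__(self, root, transform=None, loader=pil_loader):
        self.root = Path(root)
        if not self.root.is_dir():
            raise FileNotFoundError(f"image directory not found: {self.root}")
        # Collect images recursively, in a deterministic order.
        self.paths = sorted(
            p for p in self.root.rglob("*")
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
        if not self.paths:
            raise RuntimeError(f"no images found under: {self.root}")
        self.transform = transform
        self.loader = loader

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        image = self.loader(self.paths[index])
        if self.transform is not None:
            image = self.transform(image)
        # Dummy label so code that unpacks (image, label) pairs keeps working.
        return image, 0
```

At the linked line, the existing dataset construction would then be swapped for something like UnlabeledImageFolder(args.data_path, transform=transform_train), assuming the script keeps its augmentation pipeline in a variable of that name. Labels are never read during pre-training, so the constant 0 is only a placeholder.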

zhanglaoban-kk commented 10 months ago

OK, thanks for the reply. I understand now.