FlyEgle / MAE-pytorch

Masked Autoencoders Are Scalable Vision Learners

How could I organize my data to pretrain and finetune? #4

Closed insomniaaac closed 2 years ago

insomniaaac commented 2 years ago

suppose I have many images in a folder, like: 1.jpg 2.jpg ... 9999.jpg

how could I organize these pics to pretrain my model?

FlyEgle commented 2 years ago

You can use os.listdir to get all the images in your folder, and write each path with a label into a txt file. The format is like this:

xxx/1.jpg, label
xxx/2.jpg, label
...
xxx/9999.jpg, label
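A minimal sketch of building such a list file, assuming all images sit in one folder; the folder path, output file name, and the dummy label 0 are placeholders for illustration, not part of the repo:

```python
import os

image_dir = "xxx"   # hypothetical folder containing 1.jpg ... 9999.jpg
label = 0           # placeholder label (ignored during pretraining)

# write "path, label" lines, one per image
with open("train_list.txt", "w") as f:
    for name in sorted(os.listdir(image_dir)):
        if name.lower().endswith((".jpg", ".jpeg", ".png")):
            f.write(f"{os.path.join(image_dir, name)}, {label}\n")
```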

insomniaaac commented 2 years ago

Thank you for your quick reply. But I am still confused, because the model should not need labeled images in the pre-training stage. My dataset is composed of several small datasets, which makes it difficult to extract their labels. Do you mean that the pretrain code and the finetune code are in one module?

FlyEgle commented 2 years ago

The pretrain model does not need labels, but finetuning does. In my pretrain stage, I did not use the labels to supervise the model, only the images. My dataset is used for both pretraining and finetuning, so my format is image path with label. You can assign an arbitrary label, for example 0 for every image, since it is not used in the pretrain stage.
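As an illustration, here is a minimal dataset sketch that reads the "path, label" lines but returns only the image, so the dummy label is never used during pretraining; the class name, file name, and transform handling are assumptions for this example, not the repo's actual code:

```python
from PIL import Image
from torch.utils.data import Dataset

class PretrainDataset(Dataset):
    """Reads 'path, label' lines but yields only images for MAE pretraining."""
    def __init__(self, list_file, transform=None):
        with open(list_file) as f:
            # keep only the image path; the label (e.g. a dummy 0) is ignored
            self.paths = [line.rsplit(",", 1)[0].strip()
                          for line in f if line.strip()]
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = Image.open(self.paths[idx]).convert("RGB")
        return self.transform(img) if self.transform else img
```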