dmklee / equivision

ImageNet1k-pretrained SE(2) Equivariant Vision Models
MIT License

How to train on my own data #3

Open DruncBread opened 4 weeks ago

DruncBread commented 4 weeks ago

Hi, I tried your pretrained model on my dataset, but the output feature from the model is not quite equivariant. I guess it's because my images are spot matrices and they are quite different from the images in ImageNet-1k? So I decided to fine-tune the pretrained model on my own data, but I had difficulties understanding the code. Could you explain the training process to me? My dataset is a set of spot-matrix images at different rotations, and I hope the corresponding images at different rotations will get similar features through the model. To start the training, how should I organize the training data? Should there be corresponding labels for the images? And there are multiple rotation angles in my dataset (10°, 20°, 30°, ...); which pretrained model should I use? c8resnet?

Best regards

dmklee commented 4 weeks ago

What do you mean by the output feature not being "quite equivariant"? You should expect a small but non-zero equivariance error on the output due to downsampling. How are you testing the equivariance?
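
For what it's worth, here is a minimal sketch of how one might measure that error. The torch.hub entry point name is an assumption (check the README for the exact loading call), and it assumes the forward pass returns the pooled feature tensor you are comparing:

```python
import torch

# Hypothetical hub entry point -- see the README for the exact loading call.
model = torch.hub.load("dmklee/equivision", "c8resnet18", pretrained=True)
model.eval()

x = torch.randn(1, 3, 224, 224)             # stand-in for a real image batch
x_rot = torch.rot90(x, k=1, dims=(-2, -1))  # exact 90-degree rotation, no interpolation

with torch.no_grad():
    f = model(x)        # assumed to return the pooled feature tensor
    f_rot = model(x_rot)

# A small but non-zero relative error is expected due to downsampling.
err = (f - f_rot).norm() / f.norm()
print(f"relative equivariance error: {err.item():.4f}")
```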

This repo uses PyTorch Lightning to perform the training. Switching to a new dataset is relatively straightforward:

  1. Implement your own LightningDataModule (see this guide). The current ImageNetDataModule (line 17 of train.py) uses PyTorch's ImageFolder, which you might be able to adapt for your own case (see the sketch after this list).
  2. Modify line 188 of train.py to instantiate your custom data module.
  3. Run the training script as usual: python train.py ...
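
For step 1, here is a rough sketch of what such a data module might look like. The class name, argument names, and folder layout below are illustrative, not the ones used in train.py:

```python
import pytorch_lightning as pl
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder


class SpotMatrixDataModule(pl.LightningDataModule):
    """Illustrative data module for a folder-per-class dataset (names are hypothetical)."""

    def __init__(self, data_dir: str, batch_size: int = 32, img_size: int = 224):
        super().__init__()
        self.data_dir = data_dir
        self.batch_size = batch_size
        self.transform = T.Compose([
            T.Resize((img_size, img_size)),
            T.ToTensor(),
        ])

    def setup(self, stage=None):
        # Expects data_dir/train/<class>/*.png and data_dir/val/<class>/*.png
        self.train_set = ImageFolder(f"{self.data_dir}/train", transform=self.transform)
        self.val_set = ImageFolder(f"{self.data_dir}/val", transform=self.transform)

    def train_dataloader(self):
        return DataLoader(self.train_set, batch_size=self.batch_size, shuffle=True, num_workers=4)

    def val_dataloader(self):
        return DataLoader(self.val_set, batch_size=self.batch_size, shuffle=False, num_workers=4)
```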

In terms of what model to use, I would recommend the $C_8$ version if your images can be rotated by increments of 10 degrees. The dihedral versions ($D_1$, $D_4$) should only be used if your data is symmetric to reflections. Let me know if you have more questions.

DruncBread commented 4 weeks ago

Hi, thank you so much for your response! Please forgive my lack of knowledge in computer vision; this is my first time training a CV model. I still have a few questions.

  1. I might have a misunderstanding about the output. I thought the output of the pretrained model is a feature representation of the image. Is that correct? This is how I use the model (screenshot attached). The outputs for images with the same content but different rotations are not similar, so when I use the model's output for downstream analysis, the results are bad.

  2. I’m still a bit unclear about the dataloader.

To train the model, do I need to generate the rotated images myself, and organize the dataset directory like this?

data_dir/
├── train/
│   ├── content1/
│   │   ├── img1_0deg.png
│   │   ├── img1_45deg.png
│   │   ├── img1_90deg.png
│   │   ├── img2_0deg.png
│   │   ├── img2_45deg.png
│   │   └── img2_90deg.png
│   ├── content2/
│   │   ├── img3_0deg.png
│   │   ├── img3_45deg.png
│   │   ├── img3_90deg.png
│   │   ├── img4_0deg.png
│   │   ├── img4_45deg.png
│   │   └── img4_90deg.png
│   └── ...
└── val/
    ├── content1/
    │   ├── img5_0deg.png
    │   ├── img5_45deg.png
    │   ├── img5_90deg.png
    │   └── ...
    └── content2/
        ├── img6_0deg.png
        ├── img6_45deg.png
        ├── img6_90deg.png
        └── ...

Or do I just put all my images under the 'train' and 'val' directories and set the param rot_data to True, so the module will generate rotated images itself during training?

Best regards

dmklee commented 3 weeks ago

  1. You are using the model correctly. Are you rotating the image by multiples of 90 degrees? The features tensor should be invariant to discrete rotations. If you rotate the input by a continuous amount (say 23 degrees), there will be noticeably more invariance error. I cannot say much more without seeing some example images and the code you are using to measure the invariance error.
  2. You should just put the images under the train and val directories and set rot_param to True. This is a much simpler approach and it ensures that the model is trained on images at arbitrary rotations (a rough illustration of the effect follows below).
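
If it helps to picture what that flag does, the effect is roughly equivalent to an augmentation along these lines (illustrative only; the actual transform lives in train.py):

```python
import torchvision.transforms as T

# Illustrative training transform: each image is rotated by a random
# continuous angle so the model sees arbitrary orientations.
train_transform = T.Compose([
    T.RandomRotation(degrees=180),   # angle sampled uniformly from [-180, 180]
    T.Resize((224, 224)),
    T.ToTensor(),
])
```
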
DruncBread commented 2 weeks ago

Thank you so much for your prompt and helpful advice. It really helped with my project. I appreciate your support. Thanks again!