Zian-Xu / Swin-MAE

PyTorch implementation of Swin MAE: https://arxiv.org/abs/2212.13805

Loss plateaus after 200 epochs #11

Open · bqm1111 opened this issue 4 months ago

bqm1111 commented 4 months ago

Thank you for your interesting work. I tried to use your method on my custom dataset. The loss drops from 2.2 to 0.2 in less than 200 epochs, then refuses to go down any further. Did you encounter this problem? What can I do to overcome it?

Zian-Xu commented 4 months ago

Since I'm not sure which dataset you're using, I can't tell whether there is an issue with the training or whether the difficulty of the upstream task itself is keeping the loss from decreasing. Perhaps you could also try the original MAE to see whether its loss decreases on your dataset.

bqm1111 commented 4 months ago

I trained on RGB images from the SUN RGB-D dataset, which has over 5000 training images. I tried the original MAE and got the same result. It seems the training process is stuck in a local minimum. How many epochs did you train for, and is there anything special about your learning rate scheduler?

Zian-Xu commented 4 months ago

The situation you described, in which the model gets stuck in a local optimum, is indeed possible, but I can't offer specific advice on how to address it. Typically I would try different loss functions, optimizers, and so on, but there's no guarantee the problem will be resolved. Another possibility is that the upstream task on your dataset is inherently difficult, so the loss may simply not decrease further. The configuration I used can be found directly in the open-source project code; I didn't employ any special tricks.
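For reference, MAE-style pretraining commonly pairs AdamW with linear warmup followed by half-cycle cosine decay. A minimal sketch of that schedule is below; the hyperparameter values are illustrative assumptions, not this repo's actual config, which should be checked directly.

```python
import math

import torch

# Illustrative values only; the actual settings live in the repo's config.
base_lr, warmup_epochs, total_epochs = 1.5e-4, 40, 800

model = torch.nn.Linear(8, 8)  # stand-in for the Swin MAE model
optimizer = torch.optim.AdamW(model.parameters(), lr=base_lr, weight_decay=0.05)

def adjust_learning_rate(epoch: float) -> float:
    """Linear warmup, then half-cycle cosine decay, as in MAE-style recipes."""
    if epoch < warmup_epochs:
        lr = base_lr * epoch / warmup_epochs
    else:
        progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
        lr = base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
    for group in optimizer.param_groups:
        group["lr"] = lr
    return lr
```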

bqm1111 commented 4 months ago

How small do you expect your loss to be for a good reconstruction?

Zian-Xu commented 4 months ago

The final loss Swin MAE reaches differs across datasets, but for the two datasets I tried it was roughly between 0.002 and 0.003. You can see the loss curves for each experiment in the paper.

bqm1111 commented 4 months ago

Now I know what the problem is. I see that you did not use normalization or the RandomResizedCrop transform, as the original MAE does. When I drop normalization, the loss starts to come close to your reported values. Do you have any comment on the effect of those transforms?
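For concreteness, the difference in question looks roughly like the torchvision sketch below; the mean/std values are the standard ImageNet statistics from the original MAE recipe, not values taken from this repo.

```python
from torchvision import transforms

# Original-MAE-style augmentation: random crop/scale plus ImageNet normalization.
mae_style = transforms.Compose([
    transforms.RandomResizedCrop(
        224, scale=(0.2, 1.0),
        interpolation=transforms.InterpolationMode.BICUBIC,
    ),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Minimal pipeline without normalization or random cropping, as discussed above;
# pixel values stay in [0, 1].
plain = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
```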

bqm1111 commented 4 months ago

As mentioned here: if your goal is to reconstruct a good-looking image, use unnormalized pixels; if your goal is to fine-tune for a downstream recognition task, use normalized pixels. Did you fine-tune the downstream task with normalized or unnormalized pixels?
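For context, the quoted advice corresponds to the norm_pix_loss option in the reference MAE implementation, where each target patch is normalized by its own mean and variance before the MSE is computed. A sketch of that loss, adapted from the reference MAE code rather than from this repo:

```python
import torch

def reconstruction_loss(pred: torch.Tensor,
                        target: torch.Tensor,
                        mask: torch.Tensor,
                        norm_pix_loss: bool = True) -> torch.Tensor:
    """MSE computed over masked patches only.

    pred, target: [N, L, patch_dim] patchified images; mask: [N, L], 1 = masked.
    norm_pix_loss=True normalizes each target patch by its own mean/std
    (reported better for downstream fine-tuning); False reconstructs raw
    pixels (better-looking reconstructions).
    """
    if norm_pix_loss:
        mean = target.mean(dim=-1, keepdim=True)
        var = target.var(dim=-1, keepdim=True)
        target = (target - mean) / (var + 1e-6) ** 0.5
    loss = ((pred - target) ** 2).mean(dim=-1)   # per-patch MSE
    return (loss * mask).sum() / mask.sum()      # average over masked patches
```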

Zian-Xu commented 4 months ago

MAE does not rely on data augmentation as much as contrastive learning does, and I believe RandomResizedCrop destroys the integrity of medical images, so it was not used in my experiments. My previous experiments used unnormalized pixels; the role normalization plays needs further experimental verification.