XPixelGroup / HAT

CVPR 2023: Activating More Pixels in Image Super-Resolution Transformer
arXiv: HAT: Hybrid Attention Transformer for Image Restoration
Apache License 2.0

Would it yield better results if we finetune a larger model with more data? #73

Open Alidaling opened 1 year ago

Alidaling commented 1 year ago

Thank you for your outstanding work. I have two questions I would like your guidance on:

(1) Was the training data for the Real_hat_GAN_SRx4.pth model generated using "Generate degraded images on the fly"? (https://github.com/xinntao/Real-ESRGAN/blob/master/docs/Training.md#Generate-degraded-images-on-the-fly)

(2) As far as I understand, the training dataset is the same as Real-ESRGAN's (DIV2K + Flickr2K + OST). If we finetune a larger model (HAT-L_SRx4_ImageNet-pretrain.pth) with more data (as used for SwinIR-L in the SwinIR project, https://github.com/JingyunLiang/SwinIR), the performance should be better than that of the Real_hat_GAN_SRx4.pth model, right?

Looking forward to your response.
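For context, "generate degraded images on the fly" means the low-resolution inputs are synthesized from high-resolution crops inside the data loader at every iteration, rather than being precomputed once on disk. Below is a minimal single-stage sketch of that idea in NumPy; the actual Real-ESRGAN pipeline is a more elaborate two-stage chain (blur, resize, noise, JPEG compression, sinc filter), and the kernel size, sigma, and noise level here are illustrative values, not the project's settings.

```python
import numpy as np

def gaussian_kernel(size=7, sigma=1.5):
    """Build a normalized 2D Gaussian blur kernel."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def degrade_on_the_fly(hr, scale=4, sigma=1.5, noise_std=5.0, rng=None):
    """Toy degradation: blur -> downsample -> additive Gaussian noise.

    hr: float HxWx3 image in [0, 255]. Returns an (H//scale, W//scale, 3)
    low-resolution image, freshly randomized each call (as a loader would do).
    """
    rng = rng if rng is not None else np.random.default_rng()
    k = gaussian_kernel(sigma=sigma)
    pad = k.shape[0] // 2
    padded = np.pad(hr, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    # Naive spatial convolution (slow, but dependency-free for the sketch).
    blurred = np.empty_like(hr, dtype=np.float64)
    for c in range(hr.shape[2]):
        for i in range(hr.shape[0]):
            for j in range(hr.shape[1]):
                blurred[i, j, c] = np.sum(
                    padded[i:i + k.shape[0], j:j + k.shape[1], c] * k)
    lr = blurred[::scale, ::scale]                      # downsample
    lr = lr + rng.normal(0.0, noise_std, lr.shape)      # random noise each call
    return np.clip(lr, 0.0, 255.0)
```

Because the noise (and, in the real pipeline, the blur kernels and JPEG quality) is resampled every iteration, the model effectively sees an unbounded set of degradations of the same HR images.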

chxy95 commented 1 year ago
  1. Yes.
  2. Probably yes.
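For anyone attempting the finetuning in question (2): HAT trains through BasicSR-style YAML option files, so a finetune run would roughly amount to pointing `pretrain_network_g` at the released weights and swapping in the larger dataset. The fragment below is a hypothetical sketch, not a config shipped with the repo; the dataset paths, learning rate, and option names beyond the standard BasicSR `path`/`datasets` keys are assumptions to verify against the repo's own `options/` files.

```yaml
# Hypothetical BasicSR-style finetune fragment (illustrative, not from the repo)
name: finetune_HAT-L_SRx4_more_data
scale: 4

datasets:
  train:
    name: DIV2K_Flickr2K_OST_plus        # extended dataset, as in SwinIR-L
    dataroot_gt: /path/to/merged_hr_images

path:
  pretrain_network_g: experiments/pretrained_models/HAT-L_SRx4_ImageNet-pretrain.pth
  strict_load_g: true

train:
  optim_g:
    type: Adam
    lr: !!float 1e-5                     # smaller LR is typical for finetuning
```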