Closed. SevenLJY closed this issue 4 years ago.
I think the reason must lie in the dataset and in the way you obtain LR-HR training pairs. 1) Are your HR images of high enough quality? 2) How do you obtain the corresponding LR training images? 3) Are the test images from the same distribution as your LR training images, or do you test the model on real LR images?
@SevenLJY I think @yera217 has raised good points. Does your training data distribution align well with your testing data? (In other words, do they follow the same downsampling process?)
@yera217 @xinntao Thank you so much for your quick replies! To answer @yera217's questions:
All of my LR images are obtained by downsampling the HR images with bicubic interpolation. @xinntao
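Concretely, that LR generation step can be sketched as follows (a minimal Pillow sketch; the function name and the x4 scale factor are my assumptions, not details from this thread):

```python
# Bicubic LR generation from an HR image, as described above.
# x4 is ESRGAN's usual scale; adjust as needed.
from PIL import Image

def bicubic_downsample(hr, scale=4):
    # Crop so both dimensions are divisible by the scale factor,
    # keeping the LR and HR pairs pixel-aligned.
    w = hr.size[0] - hr.size[0] % scale
    h = hr.size[1] - hr.size[1] % scale
    hr = hr.crop((0, 0, w, h))
    return hr.resize((w // scale, h // scale), Image.BICUBIC)
```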
I found that the performance on my validation set is pretty good, but my test results are not as good as the pre-trained model's. So I suspect that fine-tuning may have hurt the generalization ability of the model. Does that make sense? If so, do you have any suggestions on how to maintain the model's generalization ability when fine-tuning, or on how to avoid the blurring effect?
Many thanks!
@SevenLJY Hi. So, are your test images actually from the same batch as your HR training images? Can you upload a sample of your HR training images and a test LR image here? Also, I would suggest applying a Gaussian blur with sigma=1.5 or 2.0 and kernel_size=5 to the HR training images before down-sampling. It will help your model generalize better to real image degradation.
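That blur-then-downsample pipeline can be sketched in NumPy for a single-channel float image (a dependency-light sketch; strided subsampling stands in for bicubic here to keep it self-contained, and the helper names are my own):

```python
# Gaussian blur with the suggested sigma/kernel_size, applied to the
# HR image before downsampling.
import numpy as np

def gaussian_kernel(ksize=5, sigma=1.5):
    ax = np.arange(ksize) - ksize // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()  # normalize so brightness is preserved

def blur_then_downsample(hr, scale=4, ksize=5, sigma=1.5):
    kernel = gaussian_kernel(ksize, sigma)
    pad = ksize // 2
    padded = np.pad(hr, pad, mode="reflect")
    blurred = np.zeros_like(hr, dtype=float)
    # Correlate by summing shifted copies weighted by the kernel.
    for i in range(ksize):
        for j in range(ksize):
            blurred += kernel[i, j] * padded[i:i + hr.shape[0],
                                             j:j + hr.shape[1]]
    return blurred[::scale, ::scale]
```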
@yera217 Hi. Here is a sample of my HR training data. The test LR image is the whole face image, which is not really a natural human face image but more like a texture map. Unfortunately, it is not very convenient for me to post it here.
You mean it's better to blur the HR training data and then downsample? Could you please explain a little why blurring the HR images helps generalization?
I read in some SR papers that Gaussian blur works better for real-world SR, and it actually works for me. Regarding your images: what is the resolution of the test images? Your test images should be similar to your LR training images for this to work.
Thank you so much for your suggestions! I will try Gaussian blurring in my dataset.
My test data is 256×256. Actually, I intend to do two tasks: one is 256 -> 1K, and the other is 1K -> 4K. Do I need to train two networks?
So, if you want to do 256 -> 1K SR, you should train your model on high-resolution 1K images as HR and their corresponding down-sampled 256 images as LR. The same goes for 1K -> 4K. Do you train on 1K or 4K HR images?
@SevenLJY This is blind SR. The key is that your downsampling process should match the degradation of real-world images as closely as possible. You can try using Gaussian kernels with different sigmas, as @yera217 suggests.
Also, you can read papers of blind super-resolution, such as SRMD, IKC, etc.
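The varied-kernel idea above can be sketched as sampling one blur sigma per LR-HR training pair, so the model sees a range of degradations rather than a single fixed kernel (a hypothetical helper; the [0.2, 3.0] range is my assumption, not a value taken from SRMD/IKC or this thread):

```python
# Per-pair random blur sigma for blind-SR-style training data.
import random

def sample_blur_sigma(lo=0.2, hi=3.0, seed=None):
    # Each HR image would be blurred with a Gaussian kernel of this
    # sigma before downsampling; the range is illustrative only.
    rng = random.Random(seed)
    return rng.uniform(lo, hi)
```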
@yera217 You're right! I shouldn't test the model on 256-resolution images if I trained it on 4K images. My fine-tuned model actually works well on 1K images. I really appreciate your help. It means a lot!
@xinntao Thank you so much for your explanation! I will look into it.
Hi,
I am working on fine-tuning the ESRGAN network with my own dataset. I tried learning rates of 1e-4, 1e-5, and 1e-6 for the generator and discriminator (for about 20,000 iterations each), but all of the results on the LR test images have an unexpected blurring effect. Do you have any suggestions or intuition on how to adjust the corresponding hyperparameters? For example, what is a good learning rate to start from, and are there any other useful parameters to tune?
Here are the loss curves from two of my experiments (lr=1e-6 and lr=1e-5).
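For concreteness, the kind of setup being tuned here can be sketched as a small config plus a step-decay schedule over those ~20k iterations (a hypothetical sketch; all key names and values are illustrative, not the actual ESRGAN option file or the settings used above):

```python
# Hypothetical fine-tuning hyperparameters for an ESRGAN-style GAN.
# A common pattern when fine-tuning is to start ~10x lower than the
# from-scratch learning rate and decay it during training.
finetune_opts = {
    "lr_g": 1e-5,        # generator learning rate
    "lr_d": 1e-5,        # discriminator learning rate
    "total_iters": 20000,
    "lr_steps": [5000, 10000, 15000],  # decay points
    "lr_gamma": 0.5,     # multiply lr by this at each decay point
}

def lr_at(iteration, base_lr, steps, gamma):
    # Step-decay schedule: base_lr * gamma ** (decay points passed).
    passed = sum(1 for s in steps if iteration >= s)
    return base_lr * gamma ** passed
```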