clovaai / cutblur

Rethinking Data Augmentation for Image Super-resolution (CVPR 2020)
MIT License

Why is X2 scale pretraining necessary for DIV2K #2

Closed Ir1d closed 4 years ago

Ir1d commented 4 years ago

Hi, I was trying to reproduce CutBlur and failed because I didn't use X2 scale pretraining. Then I noticed that the README mentions that "To achieve the result in the paper, X2 scale pretraining is necessary".

I'm a bit curious: have you found out why this is necessary?

Thanks in advance.

nmhkahn commented 4 years ago

Hi. Multi-scale training (or X2 pretraining) is a very common strategy in recent SR methods (including the models we used, such as CARN, EDSR, and RCAN). We also observed that skipping X2 pretraining hurts X4 performance, perhaps because 1) X2 pretraining provides a good initialization, and 2) it gives the model additional image pairs to learn from. You can find a more detailed explanation in the VDSR or EDSR papers.
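For readers unfamiliar with the trick: X2 pretraining usually means training an X2 model first, then copying every weight whose name and shape match into the X4 model, while scale-specific layers (e.g. the upsampler, whose channel count depends on the scale factor) keep their fresh initialization. A minimal sketch, assuming plain name-to-array state dicts; the helper name and the layer names are illustrative, not this repo's actual code:

```python
import numpy as np

def init_from_x2(x4_state, x2_state):
    """Warm-start the X4 model: copy every X2-pretrained tensor whose name
    and shape match; scale-specific layers keep their fresh initialization.
    Both arguments are plain name -> array dicts (illustrative layout)."""
    loaded = []
    for name, weight in x2_state.items():
        if name in x4_state and x4_state[name].shape == weight.shape:
            x4_state[name] = weight.copy()
            loaded.append(name)
    return loaded  # names of the layers that were warm-started

# Toy state dicts: the body conv matches across scales, the pixel-shuffle
# upsampler does not (2^2 * 64 channels for X2 vs. 4^2 * 64 for X4).
x2 = {"body.conv.weight": np.ones((64, 64, 3, 3)),
      "upsampler.weight": np.ones((256, 64, 3, 3))}
x4 = {"body.conv.weight": np.zeros((64, 64, 3, 3)),
      "upsampler.weight": np.zeros((1024, 64, 3, 3))}

warm = init_from_x2(x4, x2)  # only "body.conv.weight" is copied
```

In PyTorch this is typically done with `load_state_dict(..., strict=False)` after dropping the mismatched upsampler entries.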

Ir1d commented 4 years ago

Hi @nmhkahn, thanks, but I'm still a bit curious: why is it not necessary for the RealSR dataset? Is it because RealSR doesn't provide X2 downscaled images? :smile:

nmhkahn commented 4 years ago

@Ir1d The reason for using pretraining on the DIV2K dataset is to match the performance of our modified baselines (see the appendix) to the original papers' results, for a fair comparison. Since none of the backbone networks (CARN, RCAN, EDSR) report RealSR results, we skipped pretraining there for simplicity. That said, I expect pretraining would improve RealSR results as well.

Ir1d commented 4 years ago

Hi @nmhkahn, can you share your supplementary file? I'm interested in the section "CutBlur vs. Giving HR inputs during training" but couldn't find the exact setup of the experiment.

nmhkahn commented 4 years ago

@Ir1d Do you mean the detailed hyperparameter settings used in the "Giving HR inputs during training" experiment? As described in the appendix, we provide HR inputs instead of LR ones with 33% probability.
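The substitution above can be sketched as follows. This is a minimal illustration, not the repo's actual training loop: variable names and shapes are assumptions. Note that CutBlur-style models take the LR image already upsampled to HR resolution, so the HR and LR inputs share a shape and the swap needs no resizing:

```python
import numpy as np

rng = np.random.default_rng(0)

def maybe_give_hr(lr_up, hr, prob=0.33, rng=rng):
    """With probability `prob`, feed the HR image to the network instead of
    the (bicubic-upsampled) LR input. Because both tensors have the same
    spatial size, the swap is a straight replacement."""
    if rng.random() < prob:
        return hr.copy()
    return lr_up

# Toy CHW patches standing in for real training samples.
hr = np.full((3, 48, 48), 1.0)
lr_up = np.zeros_like(hr)

x = maybe_give_hr(lr_up, hr)  # the network input for this step
```

With `prob=1.0` the function always returns the HR patch, and with `prob=0.0` it always passes the LR input through unchanged.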