xinntao / EDVR

Winning Solution in NTIRE19 Challenges on Video Restoration and Enhancement (CVPR19 Workshops) - Video Restoration with Enhanced Deformable Convolutional Networks. EDVR has been merged into BasicSR and this repo is a mirror of BasicSR.
https://github.com/xinntao/BasicSR

Could you explain the detail of 2-stage SR method? #28

Open izhx opened 5 years ago

izhx commented 5 years ago

Two-stage deblurring is easy to understand, but how can I design a two-stage SR scheme?

xinntao commented 5 years ago

It is similar to two-stage deblur. In the second stage, the input is the super-resolved image of the first stage. We will make our competition models and codes public later so that you can know the details better.

izhx commented 5 years ago

Do you mean that if I want to do 4x SR, I need to do 2x SR first and then 2x SR again? That's surprisingly simple and direct.

I tried to train your remarkable work, but I didn't get excellent performance. I got two kinds of bad results: 1) white noise in black areas; 2) color shift, where the result turned purple when the scene changed. I used reflection padding, falling back to replicate padding when a sequence wasn't long enough. I didn't split long training videos at scene boundaries, since I think the odds that 7 randomly selected consecutive frames contain a scene transition are small.
I suppose the bad results may be caused by: 1) the different padding mode, or 2) insufficient training. By the way, I'm using the YCbCr color space. What's your opinion? Thanks! This is my first time doing DL-based SR (and DL in general), so please excuse any silly mistakes I may have made.

I'm looking forward to your train code.

noise in black area

color shift
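The reflection/replicate frame-index padding mentioned above can be sketched as follows. This is a hypothetical helper for picking a 7-frame temporal window near sequence boundaries, not BasicSR's actual implementation:

```python
def pad_frame_indices(center, num_frames, max_idx, mode="reflection"):
    """Return indices of a temporal window of `num_frames` around `center`.

    Out-of-range indices are mirrored ('reflection') or clamped to the
    boundary frame ('replicate'). Illustrative only.
    """
    half = num_frames // 2
    indices = []
    for i in range(center - half, center + half + 1):
        if i < 0:
            idx = -i if mode == "reflection" else 0
        elif i > max_idx:
            idx = 2 * max_idx - i if mode == "reflection" else max_idx
        else:
            idx = i
        indices.append(idx)
    return indices

# At the start of a 100-frame clip (indices 0..99), a 7-frame window:
# reflection -> [3, 2, 1, 0, 1, 2, 3]; replicate -> [0, 0, 0, 0, 1, 2, 3]
```

Whether reflection or replicate padding matters for the artifacts above is unclear; the reply below suggests the authors used zero padding without seeing them.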

xinntao commented 5 years ago

No, we upsample x4 at the first stage. In the second stage, we directly input the images with high resolution and then refine them.
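A minimal sketch of that pipeline, with NumPy stand-ins for the two EDVR models: nearest-neighbor upsampling replaces the stage-1 network, and stage 2 is an identity here purely to mark that it refines at the same resolution rather than upsampling again.

```python
import numpy as np

def stage1_sr_x4(lr):
    # Stand-in for the first-stage EDVR: x4 super-resolution
    # (nearest-neighbor repeat instead of a learned network).
    return lr.repeat(4, axis=0).repeat(4, axis=1)

def stage2_refine(hr):
    # Stand-in for the second-stage EDVR (HR input): same-resolution
    # refinement -- a real model would clean up artifacts, not resize.
    return hr

lr = np.random.rand(64, 64, 3)   # one LR frame (H, W, C)
sr = stage1_sr_x4(lr)            # (256, 256, 3) after x4 upsampling
out = stage2_refine(sr)          # (256, 256, 3): stage 2 keeps the size
```

The key point from the reply: the second stage consumes the already super-resolved frames, so its scale factor is 1.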

For the purple outputs, I think there must be some bugs somewhere. As for the white pixels, check your image-saving code. We have not observed such artifacts with zero padding.

guiji0812 commented 5 years ago

> No, we upsample x4 at the first stage. In the second stage, we directly input the images with high resolution and then refine them.
>
> For the purple outputs, I think there must be some bugs =-= And for the white pixels, may check your codes of saving images. We have not observed such artifacts with zero paddings.

When the first stage has finished, save its HR outputs and then use them as the second stage's training set. Is that the right understanding? @xinntao

xinntao commented 5 years ago

@guiji0812, yes, you are right.

guiji0812 commented 5 years ago

> @guiji0812, yes, you are right.

Thank you for your reply. I'm confused about the second stage's training set: does it use all the outputs of the first stage on the training images, or just the outputs on the test images? I used the outputs on the test images as the second stage's training set, but it didn't bring any improvement. Could you give me some advice? Thank you @xinntao

xinntao commented 5 years ago

Use all the outputs of the first stage on the training images.
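Putting the thread's recipe together, a hedged sketch of generating the stage-2 training inputs. `run_stage1`, the clip names, and the directory name are hypothetical placeholders; the GT targets stay the same as in stage 1.

```python
import os
import numpy as np

def run_stage1(lr_frame):
    # Placeholder for inference with the trained stage-1 EDVR;
    # here just nearest-neighbor x4 upsampling.
    return lr_frame.repeat(4, axis=0).repeat(4, axis=1)

# Run stage 1 over *all* training clips (not the test set) and save
# the outputs; these become the LQ inputs of the stage-2 training set.
os.makedirs("stage2_train_input", exist_ok=True)
for name in ["clip000", "clip001"]:          # iterate the full training set
    lr = np.random.rand(32, 32, 3)            # stands in for a loaded LR frame
    sr = run_stage1(lr)
    np.save(os.path.join("stage2_train_input", name + ".npy"), sr)
# Stage 2 then trains on stage2_train_input vs. the original GT images.
```

Training on stage-1 outputs of the *test* images (as tried above) leaks nothing useful into training and matches the "no improvement" symptom reported earlier in the thread.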

ShunLiu1992 commented 5 years ago

Hi xinntao: Thank you for your great contribution. Following your paper and the replies above, we used all the data from the first stage, including the training set and the validation set, to generate the second-stage input. However, when training the second-stage model, we found (via the validation set) that the model only converges to the result of the first-stage SR model. More specifically, the residual of the second stage goes to 0 after several iterations, meaning the deblurring mechanism brings no improvement.

Parameters of our second-stage model are as follows:

```yaml
network_G:
  which_model_G: EDVR
  nf: 128
  nframes: 5
  groups: 8
  front_RBs: 5
  back_RBs: 20
  predeblur: true
  HR_in: true
  w_TSA: false
```

Have you ever encountered such a dilemma during your training? Are we missing any important details? Please give us some advice; we look forward to your reply.

Best regards, Shun

xinntao commented 5 years ago

@ShunLiu1992 We observed improvements from the second stage (not marginal improvements).

1. During training, we initialize the second-stage model with the first-stage pre-trained model.
2. During the second-stage training, the inputs of the second stage are the outputs of the first stage.
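Point 1 above can be sketched as warm-starting from the stage-1 checkpoint. `TinyNet` and the `predeblur` layer are illustrative stand-ins, not the actual EDVR module names; `strict=False` lets stage-2-only layers (e.g. a predeblur module absent from the stage-1 checkpoint) keep their fresh initialization.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Toy model: a shared body, plus an optional stage-2-only layer."""
    def __init__(self, extra=False):
        super().__init__()
        self.body = nn.Conv2d(3, 3, 3, padding=1)
        if extra:
            self.predeblur = nn.Conv2d(3, 3, 3, padding=1)

# "Stage 1": train (omitted) and save the checkpoint.
stage1 = TinyNet(extra=False)
torch.save(stage1.state_dict(), "stage1.pth")

# "Stage 2": initialize from the stage-1 weights; layers that exist only
# in stage 2 are reported in missing_keys and stay randomly initialized.
stage2 = TinyNet(extra=True)
missing = stage2.load_state_dict(torch.load("stage1.pth"), strict=False)
```

Without this warm start, a residual-style second stage can plausibly collapse to the identity, which matches the "residual goes to 0" symptom reported above.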

The network structure of the second stage is very similar to the first one. You can find our 1st- and 2nd-stage models (the network structures and the configurations) in the test scripts.

trillionpowers commented 4 years ago

Hi xintao, can you explain the training details of the 2-stage scheme?

1. In the first stage, for the reconstruction module, do we need to train EDVR_woTSA with back_RBs: 20 first, then train EDVR_woTSA with back_RBs: 40, and only after that do the second stage?
2. I'd like to know the details: what exactly is done in the second stage, and which tricks do you use?

Best regards, Trillion