manuelfritsche / real-world-sr

[ICCVW 2019] PyTorch implementation of DSGAN and ESRGAN-FS from the paper "Frequency Separation for Real-World Super-Resolution". This code was the winning solution of the AIM challenge on Real-World Super-Resolution at ICCV 2019.
MIT License

Issue in DSGAN part about the unsupervised training #11

Closed JustinAsdz closed 4 years ago

JustinAsdz commented 4 years ago

Hello Manuel,

Recently, I have been following your work on FSSR. While changing the structure of DSGAN, I ran into some doubts about the unsupervised training, as follows.

As shown in the architecture figure from the paper, the unsupervised part of DSGAN lies in the discriminator: it has to tell a generated LR image apart from a real-world LR image.

But when reading the code of train.py, I find that when dealing with any dataset other than aim2019, the train_set is defined as follows:

train_set = loader.TrainDataset(PATHS[opt.dataset][opt.artifacts]['hr']['train'], cropped=True, **vars(opt))

I also roughly drew a pipeline of how the data is handled (see the attached diagram).

So, in my mind, it seems that you haven't used the real-world LR image as disc_img in the backward pass, so it can't count as unsupervised training?

Is my understanding correct?

Furthermore, if I want to have the real-world LR images involved in the train dataset, can I just change the definition of the dataset in the same way the validation dataset is defined (outputting three images: the bicubically downsampled image, the image downscaled by the generator, and the real LR image)?
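Roughly, this is the kind of dataset I have in mind; all class and argument names here are just my own guesses, not your actual loader code:

```python
import random
from torch.utils.data import Dataset

class TrainDatasetWithRealLR(Dataset):
    """Hypothetical dataset: like the validation dataset, it returns three
    images per item -- the bicubically downsampled image, the HR image that
    the generator will downscale, and an (unpaired) real-world LR image."""

    def __init__(self, hr_images, real_lr_images, bicubic_downsample):
        self.hr_images = hr_images            # list of HR tensors (C, H, W)
        self.real_lr_images = real_lr_images  # list of real-world LR tensors
        self.bicubic_downsample = bicubic_downsample  # e.g. utils.imresize

    def __len__(self):
        return len(self.hr_images)

    def __getitem__(self, idx):
        hr = self.hr_images[idx]
        bicubic_lr = self.bicubic_downsample(hr)
        # the real LR images are unpaired, so just sample one at random
        real_lr = random.choice(self.real_lr_images)
        return bicubic_lr, hr, real_lr
```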

And by the way, I want to confirm that the output of utils.imresize(img) is the bicubically downsampled version of img?
And what does the abbreviation disc (in disc_img) stand for? (Just out of personal interest.)

Thx

manuelfritsche commented 4 years ago

So, in my mind, it seems that you haven't used the real-world LR image as disc_img in the backward pass, so it can't count as unsupervised training?

We ran experiments with real-world images from the DPED dataset as well as with images with artificial corruptions. Of course, the latter are not actual real-world images, but they are meant to simulate real-world conditions.

Our goal is to generate downsampled "HR" images (i.e. "LR" images) with the same characteristics as the original HR images, or alternatively with the characteristics of another set of images. Cropping such an image does not change any of its local characteristics, but simply reduces the number of pixels used. We are applying the cropping to make sure that the real and fake images are of the same size, which balances the training of the discriminator. We are only using the original HR images during training, without assuming any specific downsampling operations. Therefore, the training happens in an unsupervised fashion.
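To make the cropping point more concrete, here is a rough sketch of the idea (illustrative names only, not the actual train.py code): the discriminator only ever sees same-sized crops of original images versus generator outputs, so no paired LR ground truth is needed.

```python
import torch

def random_crop(img, size):
    # img: (B, C, H, W) tensor; take a random size x size crop
    _, _, h, w = img.shape
    top = torch.randint(0, h - size + 1, (1,)).item()
    left = torch.randint(0, w - size + 1, (1,)).item()
    return img[:, :, top:top + size, left:left + size]

def discriminator_step(generator, discriminator, hr_batch, target_batch, crop_size):
    fake_lr = generator(hr_batch)                # downscaled image with learned characteristics
    real = random_crop(target_batch, crop_size)  # crop of an original image, same size as fake_lr
    d_real = discriminator(real)
    d_fake = discriminator(fake_lr.detach())
    # ... compute the GAN loss from d_real / d_fake and update the discriminator
    return d_real, d_fake
```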

Furthermore, if I want to have the real-world LR images involved in the train dataset, can I just change the definition of the dataset in the same way the validation dataset is defined (outputting three images: the bicubically downsampled image, the image downscaled by the generator, and the real LR image)?

I am not sure what you mean by this. What do you define as a real LR image? If you feed in the bicubically downsampled images as the target, then the whole process of learning the downsampling method is pointless, because the downsampling method is already known.

And by the way, I want to confirm that the output of utils.imresize(img) is the bicubically downsampled version of img?

Yes.
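For reference, this is what bicubic downsampling means with a standard PyTorch call; note that the repo's utils.imresize implements a MATLAB-style bicubic resize (with antialiasing), so the exact pixel values will differ slightly:

```python
import torch
import torch.nn.functional as F

# Illustrative only: 4x bicubic downsampling with a standard PyTorch call.
hr = torch.rand(1, 3, 128, 128)   # dummy HR image tensor
lr_bicubic = F.interpolate(hr, scale_factor=0.25, mode='bicubic', align_corners=False)
print(lr_bicubic.shape)           # torch.Size([1, 3, 32, 32])
```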

And what does the abbreviation disc (in disc_img) stand for? (Just out of personal interest.)

"discriminator image"

JustinAsdz commented 4 years ago

Thanks for your kind and patient reply. Now I fully understand the usage of the corruptions (Gaussian noise or JPEG artifacts), which are used to simulate the real-world domain and guide the DSGAN network.

Furthermore, if I want to have the real-world LR images involved in the train dataset, can I just change the definition of the dataset in the same way the validation dataset is defined (outputting three images: the bicubically downsampled image, the image downscaled by the generator, and the real LR image)?

I am not sure what you mean by this. What do you define as a real LR image? If you feed in the bicubically downsampled images as the target, then the whole process of learning the downsampling method is pointless, because the downsampling method is already known.

As for the operation mentioned above, I meant to have real-world images involved in the training process. In that setup, the bicubically downsampled images would not be used for learning the target domain, but only for learning the rough structure of the input image. That's what I want to do.
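To make my intention concrete, here is a rough sketch of the generator loss I have in mind; the names and the exact adversarial term are only illustrative, not taken from the repo:

```python
import torch
import torch.nn.functional as F

def generator_loss(generator, discriminator, hr, bicubic_lr, adv_weight=0.05):
    """Hypothetical loss: the bicubically downsampled image only constrains the
    rough structure (pixel-wise term), while the target domain is learned purely
    through a discriminator trained on real-world LR images."""
    fake_lr = generator(hr)
    structure_loss = F.l1_loss(fake_lr, bicubic_lr)   # rough structure from the bicubic image
    adv_loss = -torch.mean(discriminator(fake_lr))    # push outputs toward the real LR domain
    return structure_loss + adv_weight * adv_loss
```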