XPixelGroup / BasicSR

Open-source image and video restoration toolbox for super-resolution, denoising, deblurring, etc. Currently it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. It also supports StyleGAN2 and DFDNet.
https://basicsr.readthedocs.io/en/latest/
Apache License 2.0

I get unsatisfying results when reproducing your results! #83

Open hellohawaii opened 5 years ago

hellohawaii commented 5 years ago

I got unsatisfying results when trying to reproduce your results!

The following is what I did: I used the MATLAB downsampling kernel, and I used only DIV2K as the training dataset. I used the GAN loss and the RRDB_PSNR_x4.pth pretrained model. In fact, I just cloned this repo and modified the paths in train_ESRGAN.json. Here is my train_ESRGAN.json:

```json
{
  "name": "001_RRDB_ESRGAN_x4DIV2K" // please remove "debug" during training
  , "use_tb_logger": true
  , "model": "srragan"
  , "scale": 4
  , "gpu_ids": [0]

  , "datasets": {
    "train": {
      "name": "DIV2K"
      , "mode": "LRHR"
      , "dataroot_HR": "/home/liushuzhi/Desktop/fuxian/superresolution_dataset/DIV2K/DIV2K_train_HR_sub_image"
      , "dataroot_LR": "/home/liushuzhi/Desktop/fuxian/superresolution_dataset/DIV2K/DIV2K_train_LR_sub_image"
      , "subset_file": null
      , "use_shuffle": true
      , "n_workers": 8
      , "batch_size": 16
      , "HR_size": 128
      , "use_flip": true
      , "use_rot": true
    }
    , "val": {
      "name": "val_set14_part"
      , "mode": "LRHR"
      , "dataroot_HR": "/home/liushuzhi/Desktop/fuxian/superresolution_dataset/Set14/Set14_raw_image"
      , "dataroot_LR": "/home/liushuzhi/Desktop/fuxian/superresolution_dataset/Set14/Set14_LR_image"
    }
  }

  , "path": {
    "root": "/home/liushuzhi/Desktop/fuxian/BasicSR-master"
    // , "resume_state": "../experiments/debug_002_RRDB_ESRGAN_x4_DIV2K/training_state/16.state"
    , "pretrain_model_G": "../experiments/pretrained_models/RRDB_PSNR_x4.pth"
  }

  , "network_G": {
    "which_model_G": "RRDB_net" // RRDB_net | sr_resnet
    , "norm_type": null
    , "mode": "CNA"
    , "nf": 64
    , "nb": 23
    , "in_nc": 3
    , "out_nc": 3
    , "gc": 32
    , "group": 1
  }
  , "network_D": {
    "which_model_D": "discriminator_vgg_128"
    , "norm_type": "batch"
    , "act_type": "leakyrelu"
    , "mode": "CNA"
    , "nf": 64
    , "in_nc": 3
  }

  , "train": {
    "lr_G": 1e-4
    , "weight_decay_G": 0
    , "beta1_G": 0.9
    , "lr_D": 1e-4
    , "weight_decay_D": 0
    , "beta1_D": 0.9
    , "lr_scheme": "MultiStepLR"
    , "lr_steps": [50000, 100000, 200000, 300000]
    , "lr_gamma": 0.5

    , "pixel_criterion": "l1"
    , "pixel_weight": 1e-2
    , "feature_criterion": "l1"
    , "feature_weight": 1
    , "gan_type": "vanilla"
    , "gan_weight": 5e-3

    // for wgan-gp
    // , "D_update_ratio": 1
    // , "D_init_iters": 0
    // , "gp_weigth": 10

    , "manual_seed": 0
    , "niter": 5e5
    , "val_freq": 5e3
  }

  , "logger": {
    "print_freq": 200
    , "save_checkpoint_freq": 5e3
  }
}
```

The following is what I got: the visual quality is less satisfying than yours.
Your result: baboon_bicLRx4_rlt
My result: baboon_bicLRx4_rlt
The ground truth: baboon
As for PSNR and SSIM (evaluated on the Y channel), I got PSNR: 24.976743 dB and SSIM: 0.673742 when I tested on Set14, while testing your released model gave PSNR: 24.495635 dB and SSIM: 0.654736.
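For reference, PSNR_Y numbers like these usually follow the standard SR evaluation convention: convert RGB to the Y channel of YCbCr (BT.601) and crop `scale` border pixels before measuring. A minimal sketch of that convention using scikit-image (this is the common recipe, not necessarily the exact evaluation script used in this thread):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def rgb_to_y(img):
    """BT.601 RGB -> Y conversion used in most SR papers (img: float in [0, 255])."""
    return 16.0 + (65.481 * img[..., 0] + 128.553 * img[..., 1] + 24.966 * img[..., 2]) / 255.0

def psnr_ssim_y(sr, hr, scale=4):
    """PSNR/SSIM on the Y channel, cropping `scale` border pixels first."""
    sr_y = rgb_to_y(sr.astype(np.float64))[scale:-scale, scale:-scale]
    hr_y = rgb_to_y(hr.astype(np.float64))[scale:-scale, scale:-scale]
    psnr = peak_signal_noise_ratio(hr_y, sr_y, data_range=255)
    ssim = structural_similarity(hr_y, sr_y, data_range=255)
    return psnr, ssim
```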

In the issues you mentioned that the downsampling filter implemented in Python differs from the one implemented in MATLAB. Do you think this may cause my problem? I would expect your method to work with either filter. You also mentioned that you used more data in your paper, while I used only DIV2K. Is this very important? If I used more data, it would take too much time to train!
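On the filter question: MATLAB's imresize applies an antialiasing prefilter when downscaling, while OpenCV's INTER_CUBIC does not, so two "bicubic" LR sets can differ noticeably. A quick sketch to see the mismatch, assuming OpenCV and Pillow are installed (Pillow's resize is antialiased and therefore much closer to MATLAB's default; the filename is just the example image from this thread):

```python
import cv2
import numpy as np
from PIL import Image

img = cv2.imread('baboon.png')  # HR image, BGR uint8 (example filename)
h, w = img.shape[:2]

# OpenCV bicubic: no antialiasing prefilter when downscaling.
lr_cv2 = cv2.resize(img, (w // 4, h // 4), interpolation=cv2.INTER_CUBIC)

# Pillow bicubic: antialiased, much closer to MATLAB's imresize default.
lr_pil = np.array(Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
                  .resize((w // 4, h // 4), Image.BICUBIC))
lr_pil = cv2.cvtColor(lr_pil, cv2.COLOR_RGB2BGR)

# The two "bicubic" LR images differ, which creates a train/test mismatch.
print('mean abs diff:', np.abs(lr_cv2.astype(np.float32) - lr_pil.astype(np.float32)).mean())
```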

xinntao commented 5 years ago

For me, the results seem OK if you use only the DIV2K dataset; I think the remaining gap comes from the datasets. Using more data will not increase the training time compared with your current settings, because training is measured in iterations rather than epochs.
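To make the iterations-vs-epochs point concrete: with "niter": 5e5 and "batch_size": 16, the total number of gradient steps, and hence the wall-clock time, is fixed regardless of dataset size; a larger dataset just means fewer passes over it. A back-of-the-envelope sketch (the patch counts are assumptions for illustration; the real number depends on the sub-image cropping):

```python
niter = 500_000   # "niter": 5e5 from the config above
batch_size = 16   # "batch_size": 16

# Assumed patch counts; training time stays the same in every case.
for num_patches in (32_000, 64_000, 128_000):
    epochs = niter * batch_size / num_patches
    print(f'{num_patches:>7} patches -> {epochs:6.0f} epochs, same {niter} iterations')
```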

I do not think the filter is the key factor.

Feiyu-Zhang commented 5 years ago

> (quoting @hellohawaii's original comment above in full)

Hi~ Roughly how many epochs of training does it take before the Set14 images (e.g., the baboon image) look good? I trained for approximately 5000 epochs and got the image below (baboon_bicLRx4_265000). Is yours the same?

xinntao commented 5 years ago

@Feiyu-Zhang 1) 5k is not enough; you need at least >300k. 2) Your figure indicates something is wrong, and it is not from inadequate training.

Feiyu-Zhang commented 5 years ago

> @Feiyu-Zhang
> 1. 5k is not enough; you need at least >300k.
> 2. Your figure indicates something is wrong, and it is not from inadequate training.

1. Oh~ There could be a misunderstanding. I meant 5K epochs.
2. I will check the code. Thanks!

Slayerxxx commented 3 years ago

> @Feiyu-Zhang
>
> 1. 5k is not enough; you need at least >300k.
> 2. Your figure indicates something is wrong, and it is not from inadequate training.
>
> 1. Oh~ There could be a misunderstanding. I meant 5K epochs.
> 2. I will check the code. Thanks!

I guess @xinntao means at least 300k iterations to produce satisfactory results.
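That reading also resolves the epochs-vs-iterations confusion above. Under an assumed ~32k training patches and the config's batch size of 16, one epoch is roughly 2,000 iterations, so 5K epochs would already be about 10M iterations, far beyond the recommended >300k; that supports the conclusion that the artifacts come from a bug rather than undertraining. A one-line check (the patch count is an assumption):

```python
patches, batch_size = 32_000, 16          # patch count is an assumption
iters_per_epoch = patches // batch_size   # ~2,000 iterations per epoch
print(5_000 * iters_per_epoch)            # 5K epochs ~= 10,000,000 iterations >> 300k
```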