Elin24 / PSPL

The implementation of ICASSP 2020 paper "Pixel-level self-paced learning for super-resolution"

Question regarding the effect of the similarity map #2


drcdr commented 4 years ago

I was curious about the effect of the similarity map, so I added a few lines of code to the forward function of the Loss class to write out the sr[0] and hr[0] patches before and after multiplication by weight = gauss(ssim).detach(), for batch 1 of each epoch (a sketch of that instrumentation follows the command below). My training command was:

python main.py --model EDSR --scale 4 --data_test Set5+Set14+B100+Urban100+DIV2K --n_GPUs 1 --epochs 300
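
For reference, here is a minimal sketch of the kind of instrumentation described above (the dump_patches helper is hypothetical; the weight = gauss(ssim).detach() factor follows the naming used in this issue):

    import torch
    import torchvision.utils as vutils

    def dump_patches(sr, hr, weight, epoch, rgb_range=255):
        # sr, hr: (N, C, H, W) tensors in [0, rgb_range];
        # weight: the detached similarity map, weight = gauss(ssim).detach().
        panels = torch.stack([sr[0], hr[0], sr[0] * weight[0], hr[0] * weight[0]])
        # One row of four panels: sr, hr, weighted sr, weighted hr.
        vutils.save_image(panels.clamp(0, rgb_range) / rgb_range,
                          'patches_epoch%03d.png' % epoch, nrow=4)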

For clarification, all arguments are:

Namespace(G0=64, RDNconfig='B', RDNkSize=3, act='relu', batch_size=16, betas=(0.9, 0.999), chop=False, cpu=False, data_range='1-800/801-900', data_test=['Set5', 'Set14', 'B100', 'Urban100', 'DIV2K'], data_train=['DIV2K'], debug=False, decay='200', dilation=False, dir_data='../x_imagedata', dir_demo='../test', disable_PSPL=False, epochs=300, epsilon=1e-08, ext='sep', extend='.', gamma=0.5, gan_k=1, gclip=0, load='', loss='1*L1', lr=0.0001, model='EDSR', momentum=0.9, n_GPUs=1, n_colors=3, n_feats=64, n_layers=8, n_resblocks=16, n_resgroups=10, n_threads=6, negative_slope=0.2, no_augment=False, optimizer='ADAM', patch_size=192, pre_train='', precision='single', print_every=250, reduction=16, res_scale=1, reset=False, resume=0, rgb_range=255, save='EDSR_04-08_22-15-40', save_gt=False, save_models=False, save_results=False, scale=[4], seed=1, self_ensemble=False, shift_mean=True, skip_threshold=100000000.0, splalpha=0.3, splbeta=0, split_batch=1, splval=2, template='.', test_every=1000, test_only=False, weight_decay=0)

My evaluation results were:

  [Set5 x4]     PSNR: 32.076 (Best: 32.134 @epoch 268)  ssim=0.896102
  [Set14 x4]    PSNR: 28.535 (Best: 28.568 @epoch 267)  ssim=0.785463
  [B100 x4]     PSNR: 27.539 (Best: 27.547 @epoch 257)  ssim=0.743243
  [Urban100 x4] PSNR: 25.956 (Best: 25.961 @epoch 293)  ssim=0.785183
  [DIV2K x4]    PSNR: 28.897 (Best: 28.903 @epoch 257)  ssim=0.837567

Here is what the images look like, for a few epochs as training progresses. From left to right: sr[0], hr[0], sr[0]*weight, hr[0]*weight.

[Image: patch grids (sr[0], hr[0], sr[0]*weight, hr[0]*weight) across several epochs]

Is this about what you'd expect? They just seemed a little noisier to me than, e.g., Figure 2 in the paper. I can also try the training commands that you used in #1; the x2 case is running now, and it looks like it'll take about 2.5 days...

Elin24 commented 4 years ago

emmm ....

Firstly, the EDSR you use is the simplified one (n_feats=64, n_resblocks=16); the best (and biggest) one uses n_feats=256, n_resblocks=32, as shown in EDSR_paper.
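
Assuming this repo keeps the EDSR-PyTorch --template option (a template field does appear in the Namespace above, though this is not verified against this fork), the full-size model could be requested with something like:

    python main.py --template EDSR_paper --scale 4 --data_test Set5+Set14+B100+Urban100+DIV2K --n_GPUs 1 --epochs 300

In upstream EDSR-PyTorch, that template expands to --n_feats 256 --n_resblocks 32 --res_scale 0.1.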

Secondly, the weighted HR and SR images are not meant to have visual value; their value is in how they change the loss (giving more weight to hard pixels). The corresponding images in my paper use single-channel SSIM maps only for better exhibition; however, 3-channel maps work better than single-channel ones.
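
To make this concrete, here is a minimal sketch of a loss weighted in this way (the gauss mapping and the SSIM map follow the thread's naming; the function itself is illustrative, not the repo's exact code):

    import torch.nn.functional as F

    def pspl_l1(sr, hr, ssim_map, gauss):
        # Map the per-pixel SSIM to weights and detach, so the map re-weights
        # the loss without receiving gradients itself.
        weight = gauss(ssim_map).detach()
        # Hard pixels get larger weights and therefore dominate the L1 loss.
        return F.l1_loss(weight * sr, weight * hr)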

As for training time, PSPL's function is to speed up convergence, as shown in the ablation experiment.