drcdr opened this issue 4 years ago
emmm ....
Firstly, the EDSR you use is the simplified one (n_feats=64, n_resblocks=16); the best (i.e. the biggest) one is n_feats=256, n_resblocks=32, as shown in the EDSR paper.
Secondly, the weighted HR and SR images are not meant to be visually meaningful. Their value lies in how they change the loss (putting more weight on hard pixels). The corresponding images in my paper use single-channel SSIM maps only for clearer exhibition; however, 3-channel maps work better.
As for training time, PSPL's purpose is to speed up convergence, as explained in the ablation experiment.
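To make the weighting idea above concrete, here is a minimal numpy sketch of an SSIM-weighted L1 loss. The function name gauss, the centering at mu, and the value of sigma are illustrative assumptions, not the repo's actual hyperparameters; in the real PyTorch code the weight map is additionally detached so no gradient flows through it.

```python
import numpy as np

def gauss(ssim_map, mu=0.0, sigma=1.0):
    """Map an SSIM map to per-pixel weights (hypothetical form).
    Pixels whose SSIM is far from mu -- here, well-reconstructed
    pixels with high SSIM -- receive smaller weights."""
    return np.exp(-((ssim_map - mu) ** 2) / (2.0 * sigma ** 2))

def weighted_l1(sr, hr, ssim_map):
    """L1 loss with per-pixel weights; the weight map is treated as
    a constant (the torch version would call .detach() on it)."""
    w = gauss(ssim_map)
    return np.mean(w * np.abs(sr - hr))

# Toy example: a "hard" pixel (low SSIM) contributes more to the loss
# than an "easy" one (high SSIM) for the same absolute error.
sr = np.array([0.5, 0.5])
hr = np.array([0.0, 1.0])        # both pixels have |error| = 0.5
ssim_map = np.array([0.2, 0.9])  # first pixel is "hard"
w = gauss(ssim_map)
```

With these toy numbers the hard pixel's weight exceeds the easy pixel's, which is the "give more weight to hard pixels" behavior described above.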
I was curious about the effect of the similarity map, so I added a few lines of code to the forward function of the Loss class to write out the sr[0] and hr[0] patches, before and after multiplication by weight=gauss(ssim).detach(), for batch 1 of each epoch. My training command was:

For clarification, all arguments are:
My evaluation results were:
Here is what the images look like as the epochs change, for just a few epochs. From left to right, these are sr[0], hr[0], sr[0]*weight, hr[0]*weight.
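For reference, the patch-dumping step I describe could be sketched as below. This is a hypothetical reconstruction, not the actual lines I added: the tensor-to-image conversion is shown with numpy stand-ins, and the file-saving step (e.g. via PIL) is left as a comment so the sketch stays self-contained.

```python
import numpy as np

def to_uint8(img, rgb_range=255.0):
    """Clamp a float CHW image to [0, rgb_range] and convert to HWC uint8."""
    img = np.clip(img, 0.0, rgb_range)
    return np.transpose(img, (1, 2, 0)).round().astype(np.uint8)

# Toy stand-ins for sr[0], hr[0] (3 x H x W tensors) and the weight map;
# in the real Loss.forward these would be torch tensors and the weight
# would be gauss(ssim).detach().
sr0 = np.random.rand(3, 8, 8) * 255.0
hr0 = np.random.rand(3, 8, 8) * 255.0
weight = np.random.rand(3, 8, 8)

# The four panels, left to right: sr[0], hr[0], sr[0]*weight, hr[0]*weight.
panels = [sr0, hr0, sr0 * weight, hr0 * weight]
images = [to_uint8(p) for p in panels]
# for i, im in enumerate(images):
#     Image.fromarray(im).save(f"panel_{i}.png")   # requires PIL
```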
Is this about what you'd expect? They just seem a little noisier to me than, e.g., Figure 2 in the paper. I can also try the training commands you used in #1; the x2 case is running now, and it looks like it will take about 2.5 days...