tensorlayer / SRGAN

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
https://github.com/tensorlayer/tensorlayerx

Why is the VGG loss multiplied by 2e-6? #194

Open sdlpkxd opened 4 years ago

sdlpkxd commented 4 years ago

Excuse me.

I am confused by vgg_loss. Why is the VGG loss multiplied by 2e-6? The line in question is: vgg_loss = 2e-6 * tl.cost.mean_squared_error(feature_fake, feature_real, is_mean=True)

I tried removing the 2e-6 from the VGG loss, but then the value of tl.cost.mean_squared_error() becomes very large, even above 1e6. Is the difference between the features really that big?

Could anyone help me understand this, or give me some advice?

zsdonghao commented 4 years ago

just a hyperparameter from the paper

sdlpkxd commented 4 years ago

Thank you for your reply.

There is one more thing I would like to know. The MSE loss between the predicted image and the label image is around 1e-1, but the VGG feature loss is around 1e6. Why is there such a big difference once the loss is computed on features? What is the relationship between the feature loss and the magnitude of the feature values?

I hope for a reply. Thanks.
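One part of the gap can be checked without any network: MSE grows with the square of the magnitude of the values being compared. The sketch below uses plain Python and made-up data. The "features" are just the pixel values multiplied by an assumed factor (`feature_scale`, a hypothetical number, not a measured VGG statistic), so it only illustrates the quadratic scaling and says nothing about what real VGG activations look like.

```python
import random

def mean_squared_error(a, b):
    """Plain-Python MSE, averaged over all elements."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

random.seed(0)
n = 1000

# Hypothetical stand-ins for a predicted and a label image, values in [-1, 1].
img_fake = [random.uniform(-1.0, 1.0) for _ in range(n)]
img_real = [random.uniform(-1.0, 1.0) for _ in range(n)]

# Assumption: pretend the feature values are simply 1000x larger than pixels.
feature_scale = 1000.0
feat_fake = [feature_scale * v for v in img_fake]
feat_real = [feature_scale * v for v in img_real]

pixel_mse = mean_squared_error(img_fake, img_real)
feature_mse = mean_squared_error(feat_fake, feat_real)

# The ratio equals feature_scale squared (up to floating-point rounding).
ratio = feature_mse / pixel_mse
print(pixel_mse, feature_mse, ratio)
```

Under this simplification, a loss ratio of about 1e7 (1e6 versus 1e-1) would correspond to value differences roughly 3000 times larger, so a small constant such as 2e-6 mainly brings the two terms back to a comparable scale.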

15732031137 commented 4 years ago

@sdlpkxd @zsdonghao Hello! I have seen several versions of the code, where vgg_loss is multiplied by 2e-6, but the paper says that formula 5 (that is, vgg_loss) is multiplied by 0.006. What is going on? Thank you. I wish you a happy life and academic progress!
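A small arithmetic sketch may help compare the two constants. Because MSE is quadratic, multiplying the loss by a weight w has the same effect as multiplying both feature maps by sqrt(w) before computing the loss. As far as I recall, the paper describes its 0.006 as the result of rescaling the VGG feature maps by 1/12.75; that detail is quoted from memory and should be checked against the paper. The function name `equivalent_feature_scale` is made up for this illustration.

```python
import math

def equivalent_feature_scale(loss_weight):
    """Feature-map scale s with the same effect as a loss weight w (s*s == w)."""
    return math.sqrt(loss_weight)

paper_weight = 0.006   # value quoted in this thread from the paper
code_weight = 2e-6     # value used in the code discussed here

paper_scale = equivalent_feature_scale(paper_weight)
code_scale = equivalent_feature_scale(code_weight)

# (1 / 12.75) squared, to compare against the paper's rounded 0.006.
rescale_squared = (1.0 / 12.75) ** 2

# How much smaller the code's weight is than the paper's.
weight_ratio = paper_weight / code_weight

print(paper_scale, code_scale, rescale_squared, weight_ratio)
```

The two weights differ by a factor of about 3000. One possible explanation, which is only a guess, is that the implementations feed VGG with differently normalized inputs, so the raw feature magnitudes differ and the constant was retuned to compensate.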

KhushbooChauddhary commented 2 years ago

The total perceptual loss in the SRGAN paper is a weighted sum of the content loss and the adversarial loss:

Total loss = Content loss + 10^(-3) * Adversarial loss

Could someone explain why 10^(-3) is used? What is the impact on performance if some other value is used? Does it affect the number of iterations needed to train the network?
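To make the role of the weight concrete, here is a minimal sketch of the weighted sum. The loss values are hypothetical placeholders, not measurements from SRGAN training, and `total_loss` and `adversarial_share` are names invented for this example. It only shows how the weight changes the share of the total that comes from the adversarial term; it does not predict image quality or training time.

```python
def total_loss(content_loss, adversarial_loss, adv_weight=1e-3):
    """Weighted sum: content loss plus adv_weight times adversarial loss."""
    return content_loss + adv_weight * adversarial_loss

def adversarial_share(content_loss, adversarial_loss, adv_weight):
    """Fraction of the total loss contributed by the adversarial term."""
    weighted = adv_weight * adversarial_loss
    return weighted / (content_loss + weighted)

# Hypothetical loss values, chosen only for illustration.
content = 0.1
adversarial = 0.7

for w in (1e-4, 1e-3, 1e-2, 1e-1, 1.0):
    print(w, total_loss(content, adversarial, w),
          adversarial_share(content, adversarial, w))
```

With these placeholder numbers, a weight of 1e-3 keeps the adversarial term well under 1% of the total, while a weight of 1.0 makes it the dominant term. In general, a larger weight lets the discriminator signal steer the generator more strongly relative to the content loss.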