jcjohnson / neural-style

Torch implementation of neural style algorithm
MIT License

How to understand TVLoss? #302

Open EthanZhangYi opened 8 years ago

EthanZhangYi commented 8 years ago

This is my first time using Torch and Lua. I read the CVPR paper "Image Style Transfer Using Convolutional Neural Networks" and the neural_style.lua code in this repository, but I cannot understand the TVLoss module. What is it used for? I could not find any description or discussion of TVLoss in the paper, which proposes only a content loss and a style loss. Could anyone help?

EthanZhangYi commented 8 years ago

@jcjohnson Could you please give me some help?

jcjohnson commented 8 years ago

The total variation (TV) loss encourages spatial smoothness in the generated image. It was not used by Gatys et al. in their CVPR paper, but it can sometimes improve the results. For more details and explanation, see Mahendran and Vedaldi, "Understanding Deep Image Representations by Inverting Them", CVPR 2015.
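
To make "spatial smoothness" concrete, here is a minimal sketch in Python of one common form of the penalty: the sum of squared differences between adjacent pixels. It assumes a single-channel image stored as a list of rows. The names `tv_loss` and `weight` are illustrative only and are not taken from neural_style.lua, whose TVLoss module may differ in detail.

```python
def tv_loss(image, weight=1.0):
    """Sum of squared differences between horizontally and vertically
    adjacent pixels, scaled by weight (an assumed, simplified TV form)."""
    height = len(image)
    width = len(image[0])
    total = 0.0
    for i in range(height):
        for j in range(width):
            if j + 1 < width:  # right-hand neighbour
                total += (image[i][j + 1] - image[i][j]) ** 2
            if i + 1 < height:  # neighbour below
                total += (image[i + 1][j] - image[i][j]) ** 2
    return weight * total


# A flat image has no variation; a checkerboard has the maximum for 0/1 pixels.
smooth = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
noisy = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

print(tv_loss(smooth))  # 0.0
print(tv_loss(noisy))   # 12.0
```

Adding a term like this to the objective penalizes pixel-level noise, and a larger weight pushes the optimizer toward a smoother image.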

EthanZhangYi commented 8 years ago

Thank you, @jcjohnson! Your answer is very helpful.

bmaltais commented 8 years ago

I ran some tests in which I produced the same style transfer using various tv values. I recorded the total style loss and noticed that the smaller the style loss, the better the resulting image looked. Here are the results I got:

tv-value        iter 50  iter 100  ...
--------------------------------------------------------
0.000085          29815      8702    3179    1257    552
0.0000850051      28868      8288    3101    1330    563
0.00008505        31854      8479    3080    1432    603
0.0000851         31620      8698    2954    1168    554
0.0000851035      31432      8940    3100    1308    566
0.0000855         31894      8308    3174    1295    584
0.000086          33001      9533    3212    1334    607
0.0000875         27660      8669    3362    1370    614
0.00009           29824      9244    3373    1456    603
0.00010           30223      8553    3381    1361    634
0.0002000         29600     10388    3968    1844    868
0.00100           48000     22791   14244   10523   7780

So, based on my testing, the best tv value was 0.000085, which gave the lowest final style loss (552).

Results may vary based on the src and dst images, but the nice thing is that you only need something like 250 iterations to pick the winner. So try 0.000085, 0.0001, or 0.0002 and see which is best for you.
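
The selection step described above can be sketched in a few lines of Python. The losses are the final-column values copied from the table. The names `final_style_loss` and `pick_best` are hypothetical, and treating "lowest style loss" as "best" is the assumption made in this comment, not a general rule.

```python
# Final-column style losses from the table, keyed by tv value.
# Keys are kept as strings so they match the values as written.
final_style_loss = {
    "0.000085": 552,
    "0.0000850051": 563,
    "0.00008505": 603,
    "0.0000851": 554,
    "0.0000851035": 566,
    "0.0000855": 584,
    "0.000086": 607,
    "0.0000875": 614,
    "0.00009": 603,
    "0.00010": 634,
    "0.0002000": 868,
    "0.00100": 7780,
}


def pick_best(losses):
    """Return the (tv value, loss) pair with the smallest loss."""
    return min(losses.items(), key=lambda item: item[1])


best_tv, best_loss = pick_best(final_style_loss)
print(best_tv, best_loss)  # 0.000085 552
```

For a different pair of images, the dictionary would be filled with the losses logged from your own short runs.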