Question about improved style loss

velikodniy commented 7 years ago

Could you please briefly explain why you are using the difference between sl1 and sl2 here? I cannot find this approach in the article.

And why do you divide by powers of 2 here, although the article proposes to use them as weights?

titu1994 commented 7 years ago

SL1 and SL2 difference is chained correlation. See section 3.5.

That is the geometric weighting scheme. The weight values are given in section 3.1.

velikodniy commented 7 years ago

> SL1 and SL2 difference is chained correlation. See section 3.5.

But sometimes the difference sl1[j] - sl2[j] is negative. Is it ok?

> That is the geometric weighting scheme. The weight values are given in section 3.1.

But why do you divide by powers of two? As far as I understand, the point is to increase the weights of the first layers, which represent low-level features such as color or simple short lines.

Nevertheless, your implementation shows great results. Maybe I misunderstood the article.

titu1994 commented 7 years ago

> Is it ok?

Yes. Applying K.abs(...) to it has no impact. When the style image is The Starry Night, using the absolute value blurs the image somewhat.
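As a rough illustration of why the difference can come out negative, here is a minimal NumPy sketch. It is not the repository's code: it assumes that sl1 and sl2 are Gram-matrix style losses computed at two adjacent layers, and `gram_matrix`, `style_loss`, and `chained_style_loss` are hypothetical helpers with the usual Gram-matrix normalization. Random arrays stand in for real layer activations.

```python
import numpy as np


def gram_matrix(features):
    """Gram matrix of a (height, width, channels) feature map."""
    flat = features.reshape(-1, features.shape[-1])  # shape (H*W, C)
    return flat.T @ flat  # shape (C, C)


def style_loss(style_features, combination_features):
    """Squared Gram-matrix distance with the usual 1 / (4 C^2 (HW)^2) scaling."""
    h, w, c = style_features.shape
    diff = gram_matrix(style_features) - gram_matrix(combination_features)
    return float(np.sum(diff ** 2) / (4.0 * (c ** 2) * ((h * w) ** 2)))


def chained_style_loss(style_feats, comb_feats, i):
    """Assumed form of the chained term: loss at layer i minus loss at layer i + 1."""
    sl1 = style_loss(style_feats[i], comb_feats[i])
    sl2 = style_loss(style_feats[i + 1], comb_feats[i + 1])
    return sl1 - sl2


rng = np.random.default_rng(0)
# Random stand-ins for the activations of two adjacent layers.
style_feats = [rng.normal(size=(8, 8, 4)), rng.normal(size=(4, 4, 8))]
comb_feats = [rng.normal(size=(8, 8, 4)), rng.normal(size=(4, 4, 8))]

signed = chained_style_loss(style_feats, comb_feats, 0)
# Swapping the layer order flips the sign, so one of the two is negative.
swapped = chained_style_loss(style_feats[::-1], comb_feats[::-1], 0)
```

Each individual style loss is non-negative, but nothing constrains which of the two layers has the larger loss, so the signed difference can fall on either side of zero. Wrapping it in an absolute value would change the direction of the gradient whenever the difference is negative.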

> But why do you divide by powers of two?

The idea is to increase the weights of the initial layers. In practice, when I implemented it, the weights for the initial layers blew up and completely overwhelmed the generally smaller weights of the last layers, so no style transfer took place at all.

The modification I made is to drastically weaken the strength of the initial layers (which have very large strength but more content transfer than style transfer) and to keep the original strength of the final layers (which have much less strength but more style transfer than content transfer).

With this, I basically normalize the strengths of each of the layers, i.e. the initial layers with very strong content weight have the same importance as the final layers with very strong style weight. This leads to a more uniform transfer, and almost completely avoids cases where the style transfer would fail.

This also allows the content layer to be any of the 19 layers, since its strength will be normalized anyway. With the content layer as conv5_2 rather than conv4_2, faster convergence can be obtained with better results.

Edit: As an additional bonus, the user-specified weights for content and style are quite a bit more stable, such that small changes won't affect the image much. You would need to change the order of magnitude of the content or style weight to see a visible change.

As such, the style weight can be increased from 100 to 1000 to get a much stronger style, but changing it from 100 to 105 will not have much effect at all.
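The power-of-two scaling described above can be sketched as follows. This is an illustration under assumptions, not the repository's implementation: the exact exponent is not given in this thread, so the sketch assumes the divisor halves from one layer to the next and that the last layer keeps the full style weight. `geometric_layer_weights` and `weighted_style_loss` are hypothetical names.

```python
def geometric_layer_weights(style_weight, nb_layers):
    """Divide style_weight by a power of two that shrinks with depth.

    Assumption: layer 0 gets the largest divisor, 2 ** (nb_layers - 1),
    and the last layer gets a divisor of 1 (its strength is unchanged).
    """
    return [style_weight / (2 ** (nb_layers - 1 - i)) for i in range(nb_layers)]


def weighted_style_loss(layer_losses, style_weight):
    """Sum per-layer style losses, each scaled by its geometric weight."""
    weights = geometric_layer_weights(style_weight, len(layer_losses))
    return sum(w * loss for w, loss in zip(weights, layer_losses))


weights = geometric_layer_weights(100.0, 5)
# weights == [6.25, 12.5, 25.0, 50.0, 100.0]
```

With five layers and a style weight of 100, the first layer contributes at 1/16 of the strength of the last one, which is the damping of the early layers that the comment describes.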

titu1994 commented 7 years ago

On a side note, there are many style transfer algorithms out there which trade off style for speed. I focus more on style, so whatever can improve the style will be preferred, even if it increases computation time or goes against a paper's recommendation.

The fast style transfer algorithms may technically be performing artistic style transfer, but the resultant image is by no means artistic. They look like a shoddy shape and color transfer, or an imitation by a clumsy artist, in comparison to the style image used for that purpose.

velikodniy commented 7 years ago

Thank you so much for your reply!

titu1994 commented 7 years ago

No problem!

titu1994 / Neural-Style-Transfer

Question about improved style loss #38