jcjohnson / fast-neural-style

Feedforward style transfer

The -original_colors option #96

Open ghost opened 7 years ago

ghost commented 7 years ago

The slow neural style has this "-original_colors" option which can preserve the original colors. It is not available for the fast neural style. Why? Also, what is the theory behind "-original_colors"?

htoyryla commented 7 years ago

The idea is very simple: preserve the colors of the original image by copying the chrominance from the content image into the result.

I have written a script to do this after having run e.g. fast-neural-style: https://gist.github.com/htoyryla/147f641f2203ad01b040f4b568e98260

Usage:

th original-colors -generated_image <output from fast-neural-style> -original_image <original content image> -output_image <output image>
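For illustration, the same pixelwise idea can be sketched in Python with Pillow (this is not the author's Lua script, just a minimal re-implementation of the concept): convert both images to YCbCr, keep the luminance (Y) from the stylized output and the chrominance (Cb, Cr) from the original content image.

```python
from PIL import Image

def original_colors(stylized: Image.Image, original: Image.Image) -> Image.Image:
    """Keep luminance (Y) from the stylized image and chrominance (Cb, Cr)
    from the original content image, then convert back to RGB."""
    # Make sure the two images line up pixel-for-pixel.
    original = original.resize(stylized.size)
    y, _, _ = stylized.convert("YCbCr").split()
    _, cb, cr = original.convert("YCbCr").split()
    return Image.merge("YCbCr", (y, cb, cr)).convert("RGB")
```

Usage would be `original_colors(Image.open("out.png"), Image.open("content.png")).save("recolored.png")`; since only chrominance is copied, the brightness structure painted by the style model is preserved.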
ghost commented 7 years ago

This is cool. However, this is pixel-wise color preservation. I thought it was based on some other approach.

I have always wanted to understand why the "Gram matrix" represents the style of an image. From my testing, fast neural style is very powerful at transferring the color theme, but not always the "style", at least the "style" in my mind. Slow neural style is better in terms of style, but still, color is the dominant change in the resulting image.

I think we should look into whether there is another matrix to represent the style. For example, is it possible to extract the color and the "style" separately?

I found that Prisma does a better job in this respect.

htoyryla commented 7 years ago

This paper seems to be about why the Gram matrix works: https://arxiv.org/pdf/1701.01036v1.pdf
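For reference, the Gram matrix of a layer's feature maps is just the matrix of inner products between channel activations, with spatial positions summed out, which is why it captures which features co-occur (texture/style) while discarding where they occur (content). A minimal NumPy sketch (illustrative only, not the repo's Lua/Torch code):

```python
import numpy as np

def gram_matrix(features: np.ndarray) -> np.ndarray:
    """features: (C, H, W) activations from one conv layer.
    Returns the C x C matrix of channel inner products, normalized by
    the number of spatial positions. Spatial layout is summed away."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return (f @ f.T) / (h * w)
```

The style loss in Gatys-style transfer then compares these matrices between the style image and the generated image at several layers.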

I have also felt that the original neural-style was better in terms of style transfer, but I guess the difference is mainly because the original neural-style used a default image size of 512x512, which has subsequently been found to be the optimal image size for style transfer using VGG19.

As to Prisma, I believe that it is using pixelwise color-channel blending (among other things).

PS. Convnets like VGG were not designed to preserve color information, I guess, but to classify images by detecting objects in them. Color information exists explicitly only in layer conv1_1; above that layer we have feature maps which learn features in order to perform well at classifying images. The feature maps obviously respond to color too when they detect features and objects, but whether color and style would be separable above the lowest layers feels somehow contrary to what the convnet has been trained to do. And even if it were possible, how would we know in which form the layers have internalized color information?

ghost commented 7 years ago

So you do not believe Prisma uses neural style? I believe they use some kind of slow neural style with customized optimization.

htoyryla commented 7 years ago

I only said that Prisma quite likely uses pixelwise channel blending in addition to any neural procedure (others have thought so too). Neuralwise, I guess it must use a feed-forward model (fast but limited to pretrained styles) rather than an iterative optimization (slow but unlimited possibilities).

My PS then contained my thoughts on your 'for example, is that possible to extract separately the color and the style?'

htoyryla commented 7 years ago

I wrote "Color information exists explicitly only in layer conv1_1, above that layer we have feature maps which learn features in order to perform well in classifying images."

This also means that one can experiment with using only relu1_1 as the style layer, if one wants to use the colors of the style image statistically. It works after a fashion, although the results might not be esthetically pleasing (see below). Likewise, if one wants to experiment with "separating color and style", one might try using relu1_1 together with some higher layer and giving them different weights. I can't say whether it works, as relu1_1 captures not only color but pixel-level aspects in general.

[Example outputs using only relu1_1 as style layer: out, www_out_700, kafka_out_650, rothko_colors_out_600]
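The layer-weighting experiment described above could be prototyped as a weighted sum of Gram-matrix differences. A hypothetical NumPy sketch, where the feature dictionaries and layer names stand in for activations that would come from a VGG-like network (none of this is from the repo's code):

```python
import numpy as np

def gram(f: np.ndarray) -> np.ndarray:
    """Gram matrix of (C, H, W) feature maps, normalized by spatial size."""
    c = f.shape[0]
    f = f.reshape(c, -1)
    return (f @ f.T) / f.shape[1]

def style_loss(feats_gen: dict, feats_style: dict, weights: dict) -> float:
    """Weighted sum of squared Gram differences across layers.
    Weighting relu1_1 heavily emphasizes color/pixel statistics; weighting
    a higher layer emphasizes more abstract texture structure."""
    total = 0.0
    for name, w in weights.items():
        diff = gram(feats_gen[name]) - gram(feats_style[name])
        total += w * float(np.mean(diff ** 2))
    return total
```

In an actual experiment the weights would play the role of neural-style's per-layer style weighting, e.g. a large weight on relu1_1 and a small one on a higher layer.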

universewill commented 5 years ago

@htoyryla That's inspiring. Thank u.