jcjohnson / neural-style

Torch implementation of neural style algorithm
MIT License

Weak Activations? #103

Open 3DTOPO opened 8 years ago

3DTOPO commented 8 years ago

[image: activationexamples]

It seems that where there is a strong correlation between the style and input image we get results like the sample on the left, and where there is a weak match between the two we get faded results like on the right.

I have tried playing with various parameters but can't seem to shake this. It seems that where the NN is very excited the colors are saturated, and where it's not, the colors are desaturated. Does anyone have ideas on how to balance this effect?

alexjc commented 8 years ago

This is by far the trickiest part of any StyleNet application.

Are you using normalization of the activations like in Anders' implementation (there's a flag)? In the original paper they suggest pre-normalizing the network itself, and there's a new version of the VGG model.

3DTOPO commented 8 years ago

Thanks for your suggestions. Yes, I have tried the normalized flag but haven't been able to get anything looking half decent. I will try installing the latest model - do you have a link by chance?

In a crude test, it seems that if the under-excited regions were just saturated more (without affecting saturation elsewhere), it would be a big improvement.

alexjc commented 8 years ago

The problem is that the gram matrices give you a "global" measure of style for the whole picture, so at that level you can't easily control the saturation/contrast per pixel or per region. To fix activation problems dynamically, it seems this is a clue: https://bethgelab.org/deepneuralart/
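To see why it's "global", here's a tiny numpy sketch (the shapes are placeholders, not VGG's actual ones): flattening the spatial dimensions and multiplying means every pixel contributes to the same channels-by-channels matrix, so per-region information is gone.

```python
import numpy as np

# Placeholder layer activations: 64 channels at 32x32 spatial resolution.
features = np.random.rand(64, 32, 32)

# Flatten the spatial dimensions: (channels, height * width).
F = features.reshape(features.shape[0], -1)

# Gram matrix: every spatial position is summed into the same (64, 64) matrix,
# so the style statistics carry no per-pixel or per-region information.
G = F @ F.T
print(G.shape)  # (64, 64)
```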

The paper that @leongatys refers to there in turn refers to another paper from 2012 that talks about per-pixel "Local Response Normalization" which may help. I have yet to look into this.
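For reference, the cross-channel normalization from that 2012 paper looks roughly like this in numpy (a sketch only; the parameter values are the ones reported there, and `a` is assumed to be a (channels, height, width) activation array) - I haven't verified that it helps here:

```python
import numpy as np

def local_response_norm(a, n=5, k=2.0, alpha=1e-4, beta=0.75):
    # Each activation is divided by a term built from the squared activations
    # of its n neighbouring channels at the same spatial position.
    C = a.shape[0]
    out = np.empty_like(a)
    sq = a ** 2
    for i in range(C):
        lo, hi = max(0, i - n // 2), min(C, i + n // 2 + 1)
        out[i] = a[i] / (k + alpha * sq[lo:hi].sum(axis=0)) ** beta
    return out
```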

EDIT: I found dynamic normalization (each forward pass) to cause other problems, e.g. for sketches or cartoons. Though it works in some cases.

3DTOPO commented 8 years ago

Cool, thanks, I'll check it out more. Has anyone tried the normalized network they used at the link provided?

Is there a way to simply produce an activation or excitation map for the whole picture? Such a map could then be used to adjust the image with post-processing.
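Something along these lines is what I have in mind - a rough sketch only, assuming an `excitation` map already scaled to [0, 1] at the output image's resolution, and using matplotlib's HSV helpers:

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv, hsv_to_rgb

def boost_weak_regions(image_rgb, excitation, strength=0.5):
    # image_rgb: float array (H, W, 3) in [0, 1]; excitation: (H, W) in [0, 1].
    # Saturation is boosted most where the excitation map is lowest.
    hsv = rgb_to_hsv(image_rgb)
    gain = 1.0 + strength * (1.0 - excitation)
    hsv[..., 1] = np.clip(hsv[..., 1] * gain, 0.0, 1.0)
    return hsv_to_rgb(hsv)
```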

Could you please explain more about what you did to implement dynamic normalization?

alexjc commented 8 years ago

I haven't implemented dynamic normalization. I'm looking into the whole topic now. I'm currently using the normalized network but it still has problems. Something isn't quite right in the whole setup and in some cases I get the same artefacts as you.

To produce an excitation map, you basically need to gather all the intermediate arrays, resize them, then add them all up to see how they map to the original image. I don't know Torch well enough to do this in Lua, though; my code is in Python.
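In rough Python terms (a sketch, not my actual code; it assumes each layer's activations have already been pulled out as (channels, height, width) numpy arrays, and uses scipy for the resizing):

```python
import numpy as np
from scipy.ndimage import zoom

def excitation_map(layer_activations, out_height, out_width):
    # layer_activations: list of (channels, h, w) arrays from the layers of interest.
    total = np.zeros((out_height, out_width))
    for act in layer_activations:
        sal = np.abs(act).mean(axis=0)                      # average over channels -> (h, w)
        sal = zoom(sal, (out_height / sal.shape[0],
                         out_width / sal.shape[1]), order=1)  # resize to image resolution
        sal = (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)  # per-layer scale to [0, 1]
        total += sal
    return total / len(layer_activations)
```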

3DTOPO commented 8 years ago

My apologies @alexjc; it turns out the download_models.sh script downloads the same normalized model referenced from Bethge et al.

But could you please provide links to the newest VGG model you referenced? I have dug around, but it's a bit confusing because I can't find any sort of version number or date for the models, and they seem to be served from private git repositories, which makes them difficult to browse.

3DTOPO commented 8 years ago

Also, can anyone point me to the normalized VGG-19 model? The instructions state: "Default is the original VGG-19 model; you can also try the normalized VGG-19 model used in the paper." I looked in the paper and it states that the model is publicly available, but as far as I can tell it does not provide the source (it does, however, reference several other papers).

alexjc commented 8 years ago

The "new" one is here: https://bethgelab.org/deepneuralart/ By default all open source implementations seem to use the default VGG.

3DTOPO commented 8 years ago

Thanks, but that is the same model downloaded by download_models.sh.

I think I misspoke about the normalized model not being downloaded from download_models.sh. We just needed to specify the model (which appears to be the same one you link to):

-model_file models/vgg_normalised.caffemodel
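So the full invocation looks something like this (image paths are placeholders):

```
th neural_style.lua -style_image <style.jpg> -content_image <content.jpg> \
  -model_file models/vgg_normalised.caffemodel
```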