titu1994 / Neural-Style-Transfer

Keras Implementation of Neural Style Transfer from the paper "A Neural Algorithm of Artistic Style" (http://arxiv.org/abs/1508.06576) in Keras 2.0+
Apache License 2.0

problem with Theano backend #26

Closed. ink1 closed this issue 7 years ago.

ink1 commented 7 years ago

FYI, there seems to be an issue with Keras that causes an error when using masks with the Theano backend.

https://github.com/fchollet/keras/issues/4635

titu1994 commented 7 years ago

I'm not using the Masking layer in Keras. Masking is handled in the script by multiplying each filter of each style layer by a binary mask, so there is no problem with the output shape.
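A minimal sketch of that approach (illustrative names, not the exact Network.py code), assuming Theano dimension ordering with features shaped (channels, H, W):

```python
import numpy as np
from keras import backend as K

def mask_style_features(style_features, binary_mask, nb_channels):
    """style_features: (nb_channels, H, W) tensor (Theano dim ordering).
    binary_mask: (H, W) numpy array of 0s and 1s with matching H and W."""
    # Tile the mask across every channel so the shapes match exactly,
    # then zero out the masked-off region of each filter.
    mask = K.variable(np.repeat(binary_mask[np.newaxis, :, :],
                                nb_channels, axis=0))
    return style_features * mask  # shape is unchanged: (nb_channels, H, W)
```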

I have generated all the masked images in the readme with the Theano backend, so I don't see any issue.

I also suggest not using masks when creating images. I have provided a post-processing script, masked_style_transfer.py, which gives similar results after the image has been created and is several times faster. The drawback is that, since it is a post-processing script, there is a sharp transition between the content image and the generated image.

ink1 commented 7 years ago

Hi, I'm referring to the masks which are applied to the style images. When you call style_loss you pass a layer shape which is incorrect when using the Theano backend. I don't mind this issue being closed since there is probably very little you can do; recreating the shape through other means is probably not a good idea.

> The drawback is that, since it is a post-processing script, there is a sharp transition between the content image and the generated image.

This is exactly why I need multiple masking.

titu1994 commented 7 years ago

I have tried multiple style transfer as well; it is tested with 2 styles. With 3 styles, one region will have two overlapping styles; with 4 styles, both regions will have two overlapping styles. The multiple style transfer example shown is of two styles in two different regions.

In the case of 4 styles with a binary mask, it would be as if a certain region had two styles overlapping; that region would then follow two-style transfer within each region. Of course, the time required for just two-style transfer is nearly 300 seconds per iteration, so with 4 or more styles I would assume well over 20 minutes per iteration. All of this is tested on Theano only. I am on Windows, so Theano was the only available backend for Keras for some time. Now that TensorFlow is on Windows as well, I can test on both.

titu1994 commented 7 years ago

On the point of multiple masking, it can be generalized to something other than a binary mask. Currently, I only need to supply two masks - a binary mask and its inverse - to describe which style should act on each region.

Now generalize this to n masks with n styles. Each mask has a white region designating where style transfer should occur and a black region where style transfer must not occur. Each mask must be made in such a way that none of the n masks have overlapping regions. The script then multiplies the ith binary mask into the ith style, and thus you could have true n-masked style transfer in a single image. This works only on the assumption that the n masks are truly non-overlapping; even then, a certain amount of style bleeding may occur.

This is what the Network.py and INetwork.py scripts currently do. They assert that there are n masks for n styles, but do not check that the masks are non-overlapping, as that would be overly complex and out of scope for the style transfer program. On the assumption that there is a 1-to-1 mapping between each provided style and its mask, they multiply the ith style with the ith mask and build the style loss.
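A hedged sketch of that loss construction (again illustrative names, not the exact script code), reusing mask_style_features from the sketch above:

```python
from keras import backend as K

def gram_matrix(x):
    # x: (channels, H, W) feature tensor -> (channels, channels) Gram matrix
    features = K.batch_flatten(x)
    return K.dot(features, K.transpose(features))

def masked_style_loss(style_feats, comb_feats, masks, nb_channels,
                      img_width, img_height):
    """style_feats: list of n (channels, H, W) tensors, one per style.
    comb_feats: the combination image's features at the same layer.
    masks: list of n (H, W) binary numpy arrays, assumed non-overlapping."""
    assert len(style_feats) == len(masks), "need exactly one mask per style"
    loss = K.variable(0.)
    for s_feat, mask in zip(style_feats, masks):
        # Apply the i-th mask to both the i-th style and the combination image
        s_masked = mask_style_features(s_feat, mask, nb_channels)
        c_masked = mask_style_features(comb_feats, mask, nb_channels)
        S = gram_matrix(s_masked)
        C = gram_matrix(c_masked)
        size = img_width * img_height
        loss = loss + K.sum(K.square(S - C)) / (4. * (nb_channels ** 2) * (size ** 2))
    return loss
```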

Note that the post-processing script is a restricted version of this. It supports only one content image with one style mask, or two styles over inverted regions. You can easily accomplish two-style transfer over inverted regions via the post-processing script. Steps:

  1. Perform normal style transfer with style 1 on the content image (no mask) -- image 1
  2. Perform normal style transfer with style 2 on the content image (no mask) -- image 2
  3. Use masked_style_transfer.py with image 1 as the content image and image 2 as the generated image, and provide the required mask.

This will, however, lead to sharp transitions between the two styles, so you could use some image editing tool to merge the boundary (transition) regions; a sketch of such blending follows below.
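For illustration, here is a minimal sketch of this kind of post-processing blend (not the masked_style_transfer.py script itself; names are hypothetical), with a Gaussian-blurred mask to feather the boundary instead of switching abruptly:

```python
import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

def blend_with_mask(styled_1, styled_2, mask_path, out_path, feather=5.0):
    # All three images are assumed to have the same width and height.
    img1 = np.asarray(Image.open(styled_1), dtype=np.float64)
    img2 = np.asarray(Image.open(styled_2), dtype=np.float64)
    mask = np.asarray(Image.open(mask_path).convert('L'),
                      dtype=np.float64) / 255.
    # Feather (blur) the binary mask so the transition between the
    # two styled images fades gradually across the boundary.
    mask = gaussian_filter(mask, sigma=feather)[:, :, np.newaxis]
    blended = mask * img1 + (1. - mask) * img2
    Image.fromarray(np.uint8(np.clip(blended, 0, 255))).save(out_path)
```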

ink1 commented 7 years ago

I'm happy to discuss masking, because that's what I'm working on at the moment, but we are getting off topic here.

Just to be clear: I'm on Linux, using Python 2.7, primarily on CPU only. I'm working on code based on your INetwork.py (which I have refactored because it's too big). What you have described above is basically correct, and I have implemented it using style files whose sizes are unrelated to the input (what you call base) image. I think pretty much anything goes as long as you can create an appropriate loss function. I want the styles to be applied to specific regions and the result to be blended organically through the iterative optimisation process.

  1. All my masks have the size of the input image.
  2. I'm not concerned whether the masks overlap or not; given the mask weights, you can play with these too.
  3. I agree, non-binary masks are interesting, but I can also see this heading in the direction of automatic image segmentation, with styles applied to the appropriate segments.
  4. It is essential for me that the style files can have arbitrary sizes; clearly, to a certain extent, their sizes are unrelated to the image we want to generate. Gram matrices are created for each style during the initialisation step and then used during the loss calculations (see the sketch below).
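For what it's worth, a sketch of why arbitrary style sizes work in principle: the Gram matrix is channels x channels, so its shape is independent of the style image's spatial size. This uses the keras.applications VGG16 and TensorFlow channels-last ordering as stand-ins, and precompute_style_grams is an illustrative name, not code from the repository:

```python
import numpy as np
from keras import backend as K
from keras.applications.vgg16 import VGG16

def precompute_style_grams(style_array, layer_names):
    """style_array: (1, H, W, 3) preprocessed style image -- any H and W.
    Returns one (channels, channels) numpy Gram matrix per layer."""
    model = VGG16(weights='imagenet', include_top=False,
                  input_shape=style_array.shape[1:])
    outputs = [model.get_layer(name).output for name in layer_names]
    get_feats = K.function([model.input], outputs)
    grams = []
    for feats in get_feats([style_array]):
        # (1, H', W', C) -> (C, H'*W'); the resulting Gram matrix is
        # C x C regardless of the style image's spatial dimensions.
        f = feats[0].reshape(-1, feats.shape[-1]).T
        grams.append(np.dot(f, f.T))
    return grams
```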

The code works, but only as long as the style size is the same as the input size, even though the Keras models for the styles are separate from the Keras model for the input. When the sizes differ, I get:

```
raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.
```

The style models are just like the ones you have, except each input_tensor contains only one image, and I repeat that for each style until all the Gram matrices are computed. The error above is not very helpful, and I suspect there might be something going on in Keras or deeper which links the models.
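If the problem is graph accumulation, one speculative (unverified) workaround is to compute each style's Gram matrices eagerly and reset the backend session between style models, so each per-style VGG graph is discarded rather than piling up in one default graph. Here style_arrays and style_layers are hypothetical names, and precompute_style_grams is the sketch from above:

```python
from keras import backend as K

style_gram_sets = []
for style_array in style_arrays:   # one preprocessed array per style image
    style_gram_sets.append(precompute_style_grams(style_array, style_layers))
    # Drop this style's graph before building the next model, so the
    # default graph never grows past TensorFlow's 2 GB GraphDef limit.
    K.clear_session()
# Build the main transfer model only after this loop, in a fresh graph,
# and feed the numpy Gram matrices into the loss as constants.
```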

titu1994 commented 7 years ago

It's true that Keras doesn't handle different image sizes in an already compiled model (one which has been given a fixed input size).

Or this may also turn out to be purely a memory issue? I have a 4 GB graphics card and have experienced similar errors when training very large models with large input sizes (512x512 or 384x384), and it said 2 GB then as well.

I am closing the issue for now, but if you obtain a feasible solution in either Keras or Theano, please do post it here.