jcjohnson / neural-style

Torch implementation of neural style algorithm
MIT License

Use GAN to calculate style loss #441

Open citymonkeymao opened 6 years ago

citymonkeymao commented 6 years ago

Learn from multiple styles with GAN

When hundreds of pictures are used as style images, a discriminator can be used to calculate the style loss. The discriminator takes a Gram matrix as input and is trained to tell whether the generated image belongs to the target style.
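
As a rough illustration of the idea above, here is a minimal NumPy sketch, not the code from this fork. It assumes a Gram matrix normalized by the number of feature elements, a hypothetical single-layer (linear plus sigmoid) discriminator, and the common non-saturating generator loss `-log D(G)`; the names `gram_matrix`, `discriminator_prob`, and `gan_style_loss` are made up for this example.

```python
import numpy as np


def gram_matrix(features):
    """Gram matrix of a (C, H, W) feature map, normalized by C*H*W (assumed normalization)."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)


def discriminator_prob(gram, weights, bias):
    """Hypothetical linear discriminator: probability that `gram` comes from a real style image."""
    logit = float(np.sum(weights * gram) + bias)
    return 1.0 / (1.0 + np.exp(-logit))


def gan_style_loss(gram, weights, bias, eps=1e-12):
    """Style loss for the generated image: small when the discriminator is fooled."""
    return -np.log(discriminator_prob(gram, weights, bias) + eps)


# Toy example with random features standing in for one VGG style layer.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 8))
gram = gram_matrix(features)

# An untrained discriminator (all-zero parameters) outputs 0.5 for any input.
weights = np.zeros((4, 4))
bias = 0.0
loss = gan_style_loss(gram, weights, bias)
```

In a real setup there would presumably be one such discriminator per style layer, trained on Gram matrices of the style images (real) and of the current generated image (fake).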

The traditional way of calculating style loss:

The new way of calculating style loss:

Results

Imitate Shinkai Makoto Style

Transferred using ~160 high-quality style images.

Imitate Monet (compared with CycleGAN)

Imitate Van Gogh (compared with CycleGAN)

Usage

  1. Download a style image set (borrowed from CycleGAN): bash ./datasets/download_dataset.sh <dataset name>

    <dataset name> can be one of monet2photo, vangogh2photo, ukiyoe2photo, or cezanne2photo.

  2. Do style transfer

    th neural_style.lua -style_image `./list_images.sh <style_image_dir>` -content_image <content_image> -gan -content_weight 2 -style_weight 50000 -image_size 256 -backend cudnn -num_iterations 10000 -d_learning_rate 0.000001

    The -gan option tells the script to use discriminators to calculate the style losses. -d_learning_rate is the learning rate for the discriminators. list_images.sh lists all images in a directory; file names in that directory must not contain spaces, and <style_image_dir> must not contain ~. You will need to tune the parameters for different styles and image sizes.

    Example

    Transfer fj.jpg to Van Gogh style:

    1. Download Van Gogh's paintings: bash ./datasets/download_dataset.sh vangogh2photo
    2. Apply the style to the image:
    th neural_style.lua -style_image `./list_images.sh datasets/vangogh2photo/trainA` -content_image data/fj.jpg -gan -content_weight 1 -style_weight 50000 -image_size 256 -backend cudnn -num_iterations 10000 -d_learning_rate 0.0000001
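
For readers who cannot run the shell helper, the following Python sketch shows what a script like list_images.sh might do. It assumes the helper emits a comma-separated list of paths, which is the format neural-style accepts for multiple style images; the actual script in this fork may behave differently. The function name `list_images` and the extension set are hypothetical. It also rejects file names with spaces, since the unquoted backtick substitution in the commands above is split on whitespace by the shell.

```python
import os
import tempfile

# Assumed set of image extensions; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_images(directory):
    """Return the image files in `directory` as one comma-separated string of paths."""
    names = sorted(
        name
        for name in os.listdir(directory)
        if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
    )
    for name in names:
        if " " in name:
            raise ValueError("file name contains a space: %r" % name)
    return ",".join(os.path.join(directory, name) for name in names)


# Small self-contained demo on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.png", "a.jpg", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    demo = list_images(tmp)
    demo_names = [os.path.basename(path) for path in demo.split(",")]
```

The resulting string could then be passed directly as the value of -style_image.
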
Naruto-Sasuke commented 6 years ago

Hi, this is interesting. What are the papers related to your code? I'd like to take a look.

citymonkeymao commented 6 years ago

I haven't seen any paper describing this yet. However, you can look at the paper that proposed GANs. Here I use a GAN to capture the distribution of Gram matrices.

Naruto-Sasuke commented 6 years ago

I think it would be better to calculate the percentage of correct predictions of D for fakes and reals.

citymonkeymao commented 6 years ago

The accuracy of the discriminators fluctuates, except at the 4th style layer, where the discriminator always succeeds. I guess this is because this layer is also used as the content layer.
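
A minimal sketch of how such accuracies could be tracked, assuming each discriminator outputs a probability that its input is a real style Gram matrix and that 0.5 is the decision threshold. The function name `discriminator_accuracy` and the sample numbers are made up for illustration and are not results from this project.

```python
import numpy as np


def discriminator_accuracy(real_probs, fake_probs, threshold=0.5):
    """Fraction of reals classified as real and of fakes classified as fake."""
    real_probs = np.asarray(real_probs, dtype=float)
    fake_probs = np.asarray(fake_probs, dtype=float)
    real_acc = float(np.mean(real_probs > threshold))
    fake_acc = float(np.mean(fake_probs <= threshold))
    return real_acc, fake_acc


# Illustrative outputs of one discriminator on four reals and four fakes.
real_acc, fake_acc = discriminator_accuracy(
    [0.9, 0.8, 0.4, 0.7],
    [0.1, 0.6, 0.2, 0.3],
)
print("real accuracy: %.2f, fake accuracy: %.2f" % (real_acc, fake_acc))
```

Reporting the two numbers separately, per style layer, makes it easier to spot a discriminator that always wins on one layer.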

ProGamerGov commented 6 years ago

So what are the limitations of your idea here? Is the image size limited to 256 like it is in CycleGAN? Do we need a thousand-image dataset and a pretrained CycleGAN model for each artist's style?

citymonkeymao commented 6 years ago

As with the original style transfer, you can set the image size with the -image_size option. The only limit is your memory.

I'm not sure what the lower limit on the number of images is; the Shinkai Makoto style demonstrated above used 170 images.