dxyang / StyleTransfer

Implementation of "Perceptual Losses for Real-Time Style Transfer and Super-Resolution" in PyTorch

Style Transfer

Descriptions

This project is a PyTorch implementation of Perceptual Losses for Real-Time Style Transfer and Super-Resolution. The paper trains a feed-forward image transformation network to perform style transfer, rather than iteratively optimizing the output image itself as originally proposed by Gatys et al.

The image transformation network is shown below. For a given style image, the network is trained using the MS-COCO dataset to minimize perceptual loss while being regularized by total variation. Perceptual loss is defined by the combination of feature reconstruction loss as well as the style reconstruction loss from pretrained layers of VGG16. The feature reconstruction loss is the mean squared error between feature representations, while the style reconstruction loss is the squared Frobenius norm of the difference between the Gram matrices of the feature maps.
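The loss terms described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the code in style.py: the function names, the Gram matrix normalization by C*H*W, the absolute-difference form of total variation, and the default weights are all assumptions made for the example. The feature tensors stand in for activations taken from pretrained VGG16 layers, and only a single layer is used here for brevity.

```python
import torch
import torch.nn.functional as F


def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    """Gram matrix of a (B, C, H, W) feature map, normalized by C*H*W (assumed)."""
    b, c, h, w = features.shape
    flat = features.reshape(b, c, h * w)
    return flat @ flat.transpose(1, 2) / (c * h * w)


def feature_loss(output_feats: torch.Tensor, content_feats: torch.Tensor) -> torch.Tensor:
    """Feature reconstruction loss: mean squared error between feature maps."""
    return F.mse_loss(output_feats, content_feats)


def style_loss(output_feats: torch.Tensor, style_feats: torch.Tensor) -> torch.Tensor:
    """Style reconstruction loss: squared Frobenius norm of the Gram difference,
    averaged over the batch."""
    diff = gram_matrix(output_feats) - gram_matrix(style_feats)
    return diff.pow(2).sum(dim=(1, 2)).mean()


def total_variation(image: torch.Tensor) -> torch.Tensor:
    """Total variation of a (B, C, H, W) image: summed absolute differences
    between vertically and horizontally adjacent pixels (one common variant)."""
    dh = (image[:, :, 1:, :] - image[:, :, :-1, :]).abs().sum()
    dw = (image[:, :, :, 1:] - image[:, :, :, :-1]).abs().sum()
    return dh + dw


def perceptual_loss(output_feats, content_feats, style_feats, image,
                    content_weight=1.0, style_weight=1.0, tv_weight=1e-6):
    """Weighted sum of the three terms; the weights are hypothetical defaults."""
    return (content_weight * feature_loss(output_feats, content_feats)
            + style_weight * style_loss(output_feats, style_feats)
            + tv_weight * total_variation(image))
```

In a real training loop the three feature tensors would come from passing the network output, the content image, and the style image through VGG16, and the style term would typically be summed over several layers.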

Prerequisites

Usage

Train

You can train a model for a given style image with the following command:

$ python style.py train --style-image "path_to_style_image" --dataset "path_to_coco"

Here are some options that you can use:

So to train on a GPU with mosaic.jpg as my style image and MS-COCO downloaded into a folder named coco, while visualizing a sample image throughout training, I would use the following command:

$ python style.py train --style-image style_imgs/mosaic.jpg --dataset coco --gpu 1 --visualize 1

Evaluation

You can stylize an image with a pretrained model using the following command. Pretrained models for mosaic.jpg and udine.jpg are provided.

$ python style.py transfer --model-path "path_to_pretrained_model_image" --source "path_to_source_image" --target "name_of_target_image"

You can also run the transfer on a GPU with the --gpu option. For example, to transfer the style of mosaic.jpg onto maine.jpg on a GPU, I would use:

$ python style.py transfer --model-path model/mosaic.model --source content_imgs/maine.jpg --target maine_mosaic.jpg --gpu 1
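To stylize several images with one model, the transfer command can be wrapped in a small script. The sketch below is not part of the repository: build_transfer_cmd and stylize_folder are hypothetical helpers, and the only style.py flags they use are the ones shown above (--model-path, --source, --target, --gpu). It assumes the script is run from the repository root and that the source images are .jpg files.

```python
import subprocess
from pathlib import Path


def build_transfer_cmd(model_path, source, target, gpu=None):
    """Build the argument list for one `style.py transfer` call (hypothetical helper)."""
    cmd = ["python", "style.py", "transfer",
           "--model-path", str(model_path),
           "--source", str(source),
           "--target", str(target)]
    if gpu is not None:
        # Pass the value through unchanged, as in the `--gpu 1` example above.
        cmd += ["--gpu", str(gpu)]
    return cmd


def stylize_folder(model_path, source_dir, gpu=None, dry_run=True):
    """Build (and optionally run) one transfer command per .jpg in source_dir.

    Targets are named <image>_<model>.jpg, e.g. maine_mosaic.jpg.
    With dry_run=True the commands are only returned, not executed.
    """
    style_name = Path(model_path).stem
    cmds = []
    for source in sorted(Path(source_dir).glob("*.jpg")):
        target = f"{source.stem}_{style_name}.jpg"
        cmd = build_transfer_cmd(model_path, source, target, gpu)
        cmds.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return cmds
```

Calling stylize_folder("model/mosaic.model", "content_imgs", gpu=1) returns the commands for inspection; passing dry_run=False runs them one after another.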

Results

Mosaic

Model trained on mosaic.jpg applied to a few images:

And here is a GIF showing how the output changes during the training process. Notably, the network generates qualitatively appealing output within 1000 iterations.

Udine

Model trained on udine.jpg applied to a few images:

And here is a GIF showing how the output changes during the training process. Notably, the network generates qualitatively appealing output within 1000 iterations.

Acknowledgements