how_to_convert_text_to_images

This is the code for "How to Convert Text to Images - Intro to Deep Learning #16" by Siraj Raval on YouTube

Coding Challenge - Due Date Thursday, May 4th at 12 PM PST

This week's coding challenge is to use this code to generate non-bird, non-flower images. Pick a captioned image dataset and train your StackGAN model on it! Post at least one image-text pair you generated in your README. If you want suggestions for a dataset, try this or this.

Overview

This is the code for this video on YouTube by Siraj Raval, part of the Intro to Deep Learning Nanodegree with Udacity. The model is called StackGAN, and this code reproduces the main results of the paper StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks.

Dependencies

python 2.7

TensorFlow 0.11

[Optional] Torch is needed if you use the pre-trained char-CNN-RNN text encoder.

[Optional] skip-thought is needed if you use the skip-thought text encoder.

pip install the following packages:
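
Once the dependencies are installed, a quick way to confirm that the interpreter and TensorFlow versions match the requirements above is a minimal check like the one below (nothing in it is specific to this repository):

    # Quick environment check against the versions listed above.
    import sys

    import tensorflow as tf

    print('Python: {}'.format(sys.version.split()[0]))   # expect 2.7.x
    print('TensorFlow: {}'.format(tf.__version__))       # expect 0.11.x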

Usage

Data

  1. Download our preprocessed char-CNN-RNN text embeddings for birds and flowers and save them to Data/ (a quick sanity check for the download is sketched after this list).
    • [Optional] Follow the instructions in reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.
  2. Download the birds and flowers image data and extract them to Data/birds/ and Data/flowers/, respectively.
  3. Preprocess the images:
    • For birds: python misc/preprocess_birds.py
    • For flowers: python misc/preprocess_flowers.py
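
As a sanity check after step 1, you can open the downloaded embeddings from Python. This is only a rough sketch: the file name below is an assumption (adjust it to whatever the archive actually unpacks to), and the exact structure of the pickled object may differ.

    # Hypothetical sanity check for the downloaded text embeddings.
    # The path below is an assumed example; change it to match the
    # actual contents of the archive you extracted into Data/.
    import pickle

    EMBEDDING_PATH = 'Data/birds/char-CNN-RNN-embeddings.pickle'  # assumed name

    with open(EMBEDDING_PATH, 'rb') as f:
        embeddings = pickle.load(f)

    # If this loads, the download is at least readable; inspect the
    # container type and size before training.
    print('type: {}'.format(type(embeddings)))
    print('number of entries: {}'.format(len(embeddings)))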

Training

Pretrained Model

Run Demos

Examples for birds (char-CNN-RNN embeddings), more on YouTube:

Examples for flowers (char-CNN-RNN embeddings), more on YouTube:

Save your favorite pictures generated by the models: the randomness from the noise z and from conditioning augmentation makes them creative enough to generate objects with different poses and viewpoints from the same description :smiley:
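
For context on why one description yields varied images: in the StackGAN paper, the text embedding is not fed to the generator directly; instead, latent conditioning variables are sampled from a Gaussian whose mean and diagonal variance are computed from the embedding, then concatenated with the noise z. The sketch below only illustrates that idea and is not the code in this repository; the dimensions are made up and the learned layers are replaced by random projections purely for illustration.

    # Illustrative sketch of conditioning augmentation (from the StackGAN paper),
    # NOT the implementation used in this repo. Dimensions are example values.
    import numpy as np

    def conditioning_augmentation(text_embedding, c_dim=128, rng=np.random):
        """Sample conditioning variables c ~ N(mu(e), diag(sigma(e)^2)).

        In the real model, mu and log_sigma come from a small learned layer on
        top of the text embedding; here random projections stand in for it.
        """
        e_dim = text_embedding.shape[0]
        W_mu = rng.randn(c_dim, e_dim) * 0.01   # stand-in for learned weights
        W_ls = rng.randn(c_dim, e_dim) * 0.01
        mu = W_mu.dot(text_embedding)
        log_sigma = W_ls.dot(text_embedding)
        eps = rng.randn(c_dim)                  # fresh noise on every call
        return mu + np.exp(log_sigma) * eps

    embedding = np.random.randn(1024)           # e.g. a char-CNN-RNN text embedding
    z = np.random.randn(100)                    # generator noise
    c = conditioning_augmentation(embedding)
    generator_input = np.concatenate([c, z])    # different c (and z) each sample
    print(generator_input.shape)

Because eps and z are resampled every time, the same caption produces a different generator input on each run, which is what gives the varied poses and viewpoints mentioned above.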

Credits

The credits for this code go to hanzhanggit. I've merely created a wrapper to get people started.