Chung-I / Variational-Recurrent-Autoencoder-Tensorflow

A tensorflow implementation of "Generating Sentences from a Continuous Space"

Generating Sentences from a Continuous Space

Tensorflow implementation of Generating Sentences from a Continuous Space.

Prerequisites

  1. Python packages:
    • Python 3.4 or higher
    • Tensorflow r0.12
    • Numpy

Setting up the environment:

  1. Clone this repository:
    git clone https://github.com/Chung-I/Variational-Recurrent-Autoencoder-Tensorflow.git
  2. Set up conda environment:
    conda create -n vrae python=3.6
    conda activate vrae
  3. Install python package requirements:
    pip install -r requirements.txt

Usage

Training:

python vrae.py --model_dir models --do train --new True

Reconstruct:

python vrae.py --model_dir models --do reconstruct --new False --input input.txt --output output.txt

Sample (this script reads only the first line of input.txt, generates num_pts samples, and writes them to output.txt):

python vrae.py --model_dir models --do sample --new False --input input.txt --output output.txt
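
The sampling step can be pictured with a short NumPy sketch. It is not taken from vrae.py: it assumes the encoder yields a diagonal Gaussian posterior (a mean and a log-variance vector) for the input sentence, and the names sample_latents, mean, logvar and seed are hypothetical. Only num_pts comes from this README.

```python
import numpy as np


def sample_latents(mean, logvar, num_pts, seed=None):
    """Draw num_pts latent vectors z = mean + std * eps, with eps ~ N(0, I).

    Hypothetical helper: assumes a diagonal Gaussian posterior, which may
    differ from what vrae.py actually does.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=np.float64)
    # Convert log-variance to standard deviation.
    std = np.exp(0.5 * np.asarray(logvar, dtype=np.float64))
    # One row of Gaussian noise per requested sample.
    eps = rng.standard_normal((num_pts, mean.shape[0]))
    return mean + std * eps


if __name__ == "__main__":
    # Toy 3-dimensional posterior; real latent sizes come from the model.
    z = sample_latents([0.0, 1.0, -1.0], [0.0, 0.0, 0.0], num_pts=4, seed=0)
    print(z.shape)
```

Each row of the result would then be passed to the decoder to produce one output sentence.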

Interpolate (this script requires that input.txt consist of exactly two sentences; it generates num_pts interpolations between them and writes the interpolated sentences to output.txt):

python vrae.py --model_dir models --do interpolate --new False --input input.txt --output output.txt
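
As a rough illustration of what interpolating between two sentences means, the sketch below linearly interpolates between two latent vectors. It is an assumption that vrae.py uses linear interpolation and that the endpoints are included; interpolate_latents, z_start and z_end are hypothetical names, and only num_pts appears in this README.

```python
import numpy as np


def interpolate_latents(z_start, z_end, num_pts):
    """Return num_pts points evenly spaced on the line from z_start to z_end.

    Hypothetical helper: endpoints are included here, which may not match
    the behaviour of vrae.py.
    """
    z_start = np.asarray(z_start, dtype=np.float64)
    z_end = np.asarray(z_end, dtype=np.float64)
    # Column of mixing weights from 0 (start) to 1 (end).
    weights = np.linspace(0.0, 1.0, num_pts)[:, None]
    return (1.0 - weights) * z_start + weights * z_end


if __name__ == "__main__":
    # Toy 2-dimensional latent codes standing in for two encoded sentences.
    path = interpolate_latents([0.0, 0.0], [1.0, 2.0], num_pts=5)
    print(path)
```

Decoding each row in order would give the sequence of sentences between the two inputs.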

model_dir: The location of the config file config.json and the checkpoint file.

do: Accepts 4 values: train, reconstruct, sample, or interpolate.

new: Create a model with fresh parameters if set to True; otherwise read model parameters from the checkpoints in model_dir.

config.json

Hyperparameters are not passed on the command line, as they are in tensorflow/models/rnn/translate/translate.py. Instead, vrae.py reads hyperparameters from config.json in model_dir.
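
A minimal sketch of this pattern, using only the standard library, is shown here. load_config is a hypothetical helper and not necessarily how vrae.py reads the file; the only layout taken from this README is that config.json lives in model_dir.

```python
import json
import os


def load_config(model_dir):
    """Read <model_dir>/config.json and return its contents as a dict.

    Hypothetical helper for illustration; vrae.py may load the file
    differently.
    """
    config_path = os.path.join(model_dir, "config.json")
    with open(config_path, encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # "models" matches the --model_dir value used in the commands above.
    config = load_config("models")
    print(sorted(config.keys()))
```

Keeping hyperparameters in a file next to the checkpoints means a saved model can be reloaded later without repeating its settings on the command line.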

Below are hyperparameters in config.json:

Data

The Penn Treebank corpus is included in the repo. We also provide a Chinese poem corpus, its preprocessed version (to use it, set {"model":{"data_dir": "<corpus_dir>"}} in <model_dir>/config.json), and its pretrained model (to use it, set model_dir to its location), all of which can be found here.
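
To switch corpora without editing the file by hand, something like the following could be used. set_data_dir is a hypothetical helper that is not part of the repo; it assumes only the {"model": {"data_dir": ...}} structure mentioned above and leaves every other key in config.json untouched.

```python
import json
import os


def set_data_dir(model_dir, corpus_dir):
    """Point config["model"]["data_dir"] in <model_dir>/config.json at corpus_dir.

    Hypothetical helper; all other settings in the file are preserved.
    """
    config_path = os.path.join(model_dir, "config.json")
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    # Create the "model" section if it is missing, then set the corpus path.
    config.setdefault("model", {})["data_dir"] = corpus_dir
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2, ensure_ascii=False)
    return config


if __name__ == "__main__":
    # Both paths are placeholders; substitute your own directories.
    set_data_dir("models", "<corpus_dir>")
```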