Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow.
Inspired from Andrej Karpathy's char-rnn.
To train with default parameters on the tinyshakespeare corpus, run python train.py
. To access all the parameters use python train.py --help
.
To sample from a checkpointed model, python sample.py
.
Sampling while the learning is still in progress (to check last checkpoint) works only in CPU or using another GPU.
To force CPU mode, use export CUDA_VISIBLE_DEVICES=""
and unset CUDA_VISIBLE_DEVICES
afterward
(resp. set CUDA_VISIBLE_DEVICES=""
and set CUDA_VISIBLE_DEVICES=
on Windows).
To continue training after interruption or to run on more epochs, python train.py --init_from=save
You can use any plain text file as input. For example you could download The complete Sherlock Holmes as such:
cd data
mkdir sherlock
cd sherlock
wget https://sherlock-holm.es/stories/plain-text/cnus.txt
mv cnus.txt input.txt
Then start train from the top level directory using python train.py --data_dir=./data/sherlock/
A quick tip to concatenate many small disparate .txt
files into one large training file: ls *.txt | xargs -L 1 cat >> input.txt
.
Tuning your models is kind of a "dark art" at this point. In general:
To visualize training progress, model graphs, and internal state histograms: fire up Tensorboard and point it at your log_dir
. E.g.:
$ tensorboard --logdir=./logs/
Then open a browser to http://localhost:6006 or the correct IP/Port specified.
Please feel free to: