Implementation of Tacotron 2
This is a TensorFlow implementation of "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions".
Initially, this implementation reuses existing components from Tacotron and other open-source implementations.
The main configuration step is to set the 'run options' inside hyperparams.py.
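The exact contents of hyperparams.py depend on the repository, but the run options referenced later in this README (train_form and test_only) might look roughly like this hypothetical sketch; all other fields are illustrative placeholders, not the project's actual values:

```python
# Hypothetical sketch of the 'run options' in hyperparams.py.
# Only train_form and test_only are named in this README; the
# remaining fields are illustrative placeholders.
class Hyperparams:
    # Which part of the model to train: 'Encoder' or 'Converter'
    train_form = 'Encoder'
    # Set to 1 to skip training and only generate samples
    test_only = 0
    # Audio/feature settings (illustrative values only)
    sr = 22050      # sampling rate
    n_mels = 80     # number of mel bands
    batch_size = 32
```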
Next, run 'python prepro.py' to generate the training data.
Then run 'python train.py' for training, sample generation, and model loading. Typical usages:
python train.py --log_dir=logs --log_name=test --data_paths=datasets/default --deltree=True

where log_dir sets the log output directory, log_name names the current run, data_paths points to the training data directory, and deltree deletes any existing log folder before starting.

python train.py --log_dir=logs --log_name=test --data_paths=datasets/default --load_path=logs/test

where load_path is the folder from which to load a previously trained model.

python train.py --log_dir=logs --log_name=test --data_paths=datasets/default --load_path=logs/test --load_converter=logs/converter

where load_converter is the folder from which to load a pretrained converter.

Hyperparameters for training and testing:
For better results, decouple training into three steps:
python train.py --log_dir=logs --log_name=Encoder --data_paths=datasets/default --deltree=True

trains the encoder alone. For this, set [train_form='Encoder'].

python train.py --log_dir=logs --log_name=Converter --data_paths=datasets/default --deltree=True

trains the converter alone. For this, set [train_form='Converter'].

python train.py --log_dir=logs --log_name=Encoder --data_paths=datasets/default --load_path=logs/Encoder --load_converter=logs/Converter

generates samples using the trained encoder and converter. For this, set [train_form='Encoder' and test_only=1].

To be posted in a few days (work in progress)
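The three invocations above can be assembled programmatically, for example when scripting the full pipeline. The helper below is purely illustrative: it only builds the argument lists from the flags shown in this README and assumes nothing about train.py's internals.

```python
# Illustrative helper that assembles the three train.py command lines
# from the flags documented above. It builds argument lists only;
# running them (e.g. via subprocess.run) is left to the caller.
def make_command(log_name, deltree=False, load_path=None, load_converter=None):
    cmd = ["python", "train.py",
           "--log_dir=logs",
           f"--log_name={log_name}",
           "--data_paths=datasets/default"]
    if deltree:
        cmd.append("--deltree=True")
    if load_path:
        cmd.append(f"--load_path={load_path}")
    if load_converter:
        cmd.append(f"--load_converter={load_converter}")
    return cmd

# Step 1: train the encoder; step 2: train the converter;
# step 3: load both and generate samples.
steps = [
    make_command("Encoder", deltree=True),
    make_command("Converter", deltree=True),
    make_command("Encoder", load_path="logs/Encoder",
                 load_converter="logs/Converter"),
]
```

Remember that the matching train_form (and, for step 3, test_only) must still be set in hyperparams.py before each step.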
LJ Speech Dataset (https://keithito.com/LJ-Speech-Dataset)
A lot of the base work has been taken from Kyubyong Park's (kbpark.linguist@gmail.com) implementation of Deep Voice 3 (https://www.github.com/kyubyong/deepvoice3). The attention mechanism and wrapper are adapted from Rayhane Mama's (https://www.linkedin.com/in/rayhane-mama/) implementation of Tacotron 2 (https://github.com/Rayhane-mamah/Tacotron-2).
Feel free to reach out to me if you want to contribute (dimitris@rsquared.io)