azraelkuan / tensorflow_wavenet_vocoder

wavenet vocoder using tensorflow

An implementation of a WaveNet vocoder using TensorFlow.

!!! the audio code is copied from wavenet_vocoder !!!

!!! the main tensorflow model is modified from tensorflow-wavenet !!!

Known issues

Mixture support is on the dev branch, but wav generation there still has some bugs.

To Do

Required

Getting Started

Download dataset

Preprocess data

To make training faster, preprocess the data into npy files first:

python preprocess.py --num_workers 4 --name ljspeech --in_dir /your_path/LJSpeech-1.0 --out_dir /your_outpath/ --hparams sample_rate=22050

Training

For a single speaker:

python train.py --num_gpus 4 --batch_size 2 --train_txt /your_train_txt/ --hparams gc_enable=False,global_channel=0,global_cardinality=0,NPY_DATAROOT=/your_npy_datadir/,sample_rate=22050 --logdir_root log_ljspeech
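The --hparams value in the command above is a comma-separated list of name=value overrides. How train.py parses that string is not shown in this README, so the following is only a minimal sketch of the format. parse_hparams is a hypothetical helper, not a function from this repository.

```python
import ast


def parse_hparams(overrides):
    """Parse 'name=value,name=value' into a dict.

    Hypothetical helper for illustration only; it is not taken from
    this repository and may differ from what train.py really does.
    """
    result = {}
    for item in overrides.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, raw = item.partition("=")
        if not sep:
            raise ValueError("expected name=value, got %r" % item)
        try:
            # Turns "False" into False and "22050" into 22050.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Paths and other bare strings are kept unchanged.
            value = raw
        result[name.strip()] = value
    return result


parsed = parse_hparams(
    "gc_enable=False,global_channel=0,global_cardinality=0,"
    "NPY_DATAROOT=/your_npy_datadir/,sample_rate=22050"
)
print(parsed)
```

Splitting on commas is a simplification: a value that itself contains a comma would not survive it.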

For multiple speakers:

python train.py --batch_size 2 --num_gpus 4 --train_txt /your_train_txt/ --logdir_root log_arctic

Synthesize

For a single speaker:

The eval_txt is extracted from the train_txt.

python mul_generate.py --eval_txt /your_eval_txt/ --wav_out_path test_ljspeech.wav /your_checkpoint/ --hparams gc_enable=False,global_channel=0,global_cardinality=0,NPY_DATAROOT=/your_npy_datadir/,sample_rate=22050
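Since the eval_txt is extracted from the train_txt, one way to produce it is to copy a few entries out of the training list. The sketch below assumes the train_txt holds one entry per line, which this README does not confirm. write_eval_subset and the placeholder entry names are hypothetical.

```python
import tempfile
from pathlib import Path


def write_eval_subset(train_txt, eval_txt, num_lines=2):
    """Copy the first num_lines non-empty lines of train_txt to eval_txt.

    Hypothetical helper; assumes one entry per line in train_txt.
    """
    text = Path(train_txt).read_text(encoding="utf-8")
    lines = [line for line in text.splitlines() if line.strip()]
    subset = lines[:num_lines]
    Path(eval_txt).write_text("\n".join(subset) + "\n", encoding="utf-8")
    return subset


# Demo on a throwaway file; the entry names are placeholders only.
workdir = Path(tempfile.mkdtemp())
train_path = workdir / "train.txt"
eval_path = workdir / "eval.txt"
train_path.write_text("entry_0001\nentry_0002\nentry_0003\n", encoding="utf-8")

subset = write_eval_subset(train_path, eval_path, num_lines=2)
print(subset)
```

The resulting file path is what would be passed to --eval_txt.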

For multiple speakers:

python mul_generate.py --eval_txt /your_eval_txt/ --wav_out_path test_arctic.wav /your_checkpoint/ --gc_id 6