This repository contains Fatchord's Alternative WaveRNN (Faster training), a fast-training, low-GPU-memory implementation of the WaveRNN vocoder.
See geneing's awesome fork that has model pruning, export to C++ and real time inference on CPU: https://github.com/geneing/WaveRNN-Pytorch.
Obama & Bernie Sanders: see this repo in action!
10-bit audio on held-out testing data from LJSpeech. This model sounds and trains almost as well as the 9-bit one. The higher the bit depth, the better.
9-bit audio on held-out testing data from LJSpeech. This model trains the fastest (this sample is from around 130 epochs).
Single beta distribution on held-out testing data from LJSpeech. This model is trained with a single Beta distribution output.
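To make the 9-bit and 10-bit labels above concrete: a b-bit model predicts one of 2**b amplitude classes per audio sample. The sketch below assumes plain linear quantization of a float waveform in [-1, 1]. The function names are illustrative only, and the repository's own encoding may differ (for example, mu-law).

```python
def float_to_label(x, bits):
    """Map a sample in [-1.0, 1.0] to an integer class in [0, 2**bits - 1]."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("sample must lie in [-1, 1]")
    levels = 2 ** bits
    # Shift to [0, 2], scale to [0, levels - 1], then round to the nearest class.
    return int(round((x + 1.0) * (levels - 1) / 2.0))


def label_to_float(label, bits):
    """Inverse mapping: integer class back to a sample in [-1.0, 1.0]."""
    levels = 2 ** bits
    return 2.0 * label / (levels - 1) - 1.0


# Compare the resolution of the two quantized settings.
for bits in (9, 10):
    step = 2.0 / (2 ** bits - 1)
    print(f"{bits}-bit: {2 ** bits} classes, step size {step:.5f}")
```

Under this assumption, going from 9 to 10 bits roughly halves the quantization step, which is why the higher bit depth is preferred when it trains comparably.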
Set hparams.input_type to raw or to bits to match the model you are using.
Use the hparams.py here: either replace your local hparams.py file with it, or note and update any changes.
Ensure the above requirements are met.
git clone https://github.com/G-Wang/WaveRNN-Pytorch.git
cd WaveRNN-Pytorch
pip install -r requirements.txt
Before running the scripts, you can adjust the hyperparameters in hparams.py.
Some hyperparameters that you might want to adjust:
fix_learning_rate: The model is robust enough to learn well with a fixed learning rate of 1e-4. I suggest you try this setting for the fastest training; you can decrease it to 5e-6 for final-step refinement. Set this to None to train with a learning rate schedule instead.
input_type: The best performing options are currently bits and raw; see hparams.py for more details.
batch_size
save_every_step: checkpoint saving frequency.
evaluate_every_step: evaluation frequency.
seq_len_factor: sequence length of the training audio; the longer it is, the more GPU memory it takes.
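The fix_learning_rate behaviour described above can be sketched as follows. This is an illustration only: the function name, the warm-up schedule, and the initial_lr and warmup_steps values are assumptions, not the schedule implemented in this repository.

```python
def learning_rate(step, fix_learning_rate=1e-4, initial_lr=1e-3, warmup_steps=4000):
    """Return the learning rate to use at a given training step.

    fix_learning_rate: constant rate, or None to fall back to a schedule.
    initial_lr, warmup_steps: hypothetical schedule parameters.
    """
    if fix_learning_rate is not None:
        return fix_learning_rate
    # Assumed Noam-style schedule: linear warm-up, then inverse-sqrt decay.
    step = max(step, 1)
    scale = min(step * warmup_steps ** -1.5, step ** -0.5)
    return initial_lr * warmup_steps ** 0.5 * scale


print(learning_rate(1000))                          # fixed rate
print(learning_rate(1000, fix_learning_rate=5e-6))  # final-step refinement
print(learning_rate(1000, fix_learning_rate=None))  # scheduled rate
```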
preprocess.py processes raw wav files into corresponding mel-spectrogram and wav files according to the audio processing hyperparameters.
Example usage:
python preprocess.py /path/to/my/wav/files
This will process all the .wav files in the folder /path/to/my/wav/files and save them in the default local directory called data_dir.
Pass --output_dir to specify a different directory for the processed outputs.
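The command-line shape described above (one positional wav folder plus an optional --output_dir that defaults to data_dir) could be expressed like this. It is a minimal sketch using argparse from the standard library; build_parser and wav_dir are illustrative names, and the actual script may parse its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the preprocess.py usage shown above."""
    parser = argparse.ArgumentParser(description="Preprocess wav files (sketch).")
    parser.add_argument("wav_dir", help="folder containing the input .wav files")
    # data_dir is the default output directory named in the text above.
    parser.add_argument("--output_dir", default="data_dir",
                        help="where to store the processed outputs")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["/path/to/my/wav/files", "--output_dir", "my_outputs"])
print(f"reading from {args.wav_dir}, writing to {args.output_dir}")
```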
Start the training process. Checkpoints are stored by default in the local directory checkpoints.
The script will automatically save a checkpoint when terminated with Ctrl + C.
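Saving on Ctrl + C usually comes down to catching KeyboardInterrupt around the training loop. The sketch below shows the pattern with the standard library only. The train and save_checkpoint helpers, the JSON file format, and the seven-digit file name are assumptions for illustration; the real script presumably stores PyTorch state in .pth files.

```python
import json
import os


def save_checkpoint(state, checkpoint_dir="checkpoints"):
    """Write the training state to checkpoint_dir and return the file path."""
    os.makedirs(checkpoint_dir, exist_ok=True)
    path = os.path.join(checkpoint_dir, f"checkpoint{state['step']:07d}.json")
    with open(path, "w") as f:
        json.dump(state, f)
    return path


def train(train_step, total_steps, checkpoint_dir="checkpoints"):
    """Run train_step repeatedly; save a checkpoint if interrupted by Ctrl + C."""
    state = {"step": 0}
    try:
        while state["step"] < total_steps:
            train_step(state)
            state["step"] += 1
    except KeyboardInterrupt:
        # Ctrl + C raises KeyboardInterrupt in the main thread.
        return save_checkpoint(state, checkpoint_dir)
    return None
```

Because the state is written inside the except branch, the checkpoint records the last completed step rather than a half-finished one.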
Example 1: starting a new model for training
python train.py data_dir
data_dir is the directory containing the processed files.
Example 2: Restoring training from checkpoint
python train.py data_dir --checkpoint=checkpoints/checkpoint0010000.pth
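When several checkpoints have accumulated, a small helper can pick the most recent one to pass to --checkpoint. This is a sketch, not part of the repository: latest_checkpoint is a hypothetical function, and the checkpoint<step>.pth naming pattern is inferred only from the example file name above.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from the example checkpoint0010000.pth.
_CHECKPOINT_RE = re.compile(r"checkpoint(\d+)\.pth$")


def latest_checkpoint(checkpoint_dir="checkpoints"):
    """Return the path with the highest step number, or None if there is none."""
    best_step, best_path = -1, None
    for path in Path(checkpoint_dir).glob("checkpoint*.pth"):
        match = _CHECKPOINT_RE.search(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

The returned path can then be supplied as the value of --checkpoint in the command above.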
Evaluation .wav files and plots are saved in checkpoints/eval.