G-Wang / WaveRNN-Pytorch

Fatcord's Alternative WaveRNN (Faster training)
MIT License

WaveRNN-Pytorch

This repository contains Fatcord's Alternative WaveRNN (Faster training), a fast-training implementation of the WaveRNN vocoder with a small GPU memory footprint.

Model Pruning and Real Time CPU Inference

See geneing's awesome fork, which adds model pruning, export to C++, and real-time inference on CPU: https://github.com/geneing/WaveRNN-Pytorch.

Highlights

Audio Samples

  1. Obama & Bernie Sanders: see this repo in action!

  2. 10-bit audio on held-out testing data from LJSpeech. This model sounds and trains almost as well as the 9-bit model. In general, the higher the bit depth, the better.

  3. 9-bit audio on held-out testing data from LJSpeech. This model trains the fastest (this is around 130 epochs).

  4. Single Beta distribution on held-out testing data from LJSpeech. This model is trained with a single Beta distribution as the output.

Pretrained Checkpoints

  1. Single Beta Distribution trained for 112k. Make sure to change hparams.input_type to raw.
  2. 9-bit quantized audio trained for 11k, or around 130 epochs; it can be trained further. Make sure to change hparams.input_type to bits.
  3. 10-bit quantized audio. To ensure your model is built properly, download the hparams.py here, then either replace your local hparams.py with it or note the differences and update your local file accordingly.
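
To make the "9-bit" and "10-bit" quantized inputs above more concrete, here is a minimal sketch of linear quantization of float samples in [-1, 1] into 2**bits classes and back. The function names quantize and dequantize are hypothetical and are not taken from this repository; the actual encoding used when hparams.input_type is bits may differ (for example, it could apply mu-law companding first).

```python
import math


def quantize(samples, bits=9):
    """Map float samples in [-1, 1] to integer classes in [0, 2**bits - 1].

    Illustrative only: a plain linear quantizer, assumed rather than
    copied from the repository's preprocessing code.
    """
    levels = 2 ** bits - 1
    labels = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip out-of-range samples
        labels.append(int(math.floor((x + 1.0) * levels / 2.0)))
    return labels


def dequantize(labels, bits=9):
    """Map integer classes back to floats in [-1, 1]."""
    levels = 2 ** bits - 1
    return [2.0 * label / levels - 1.0 for label in labels]
```

With this scheme, each extra bit halves the quantization step, which is one way to read the remark that higher bit depths are preferable.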

Requirements

Installation

Ensure the above requirements are met.

git clone https://github.com/G-Wang/WaveRNN-Pytorch.git
cd WaveRNN-Pytorch
pip install -r requirements.txt

Usage

1. Adjusting Hyperparameters

Before running scripts, one can adjust hyperparameters in hparams.py.

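One hyperparameter that you might want to adjust is input_type, which the pretrained checkpoints above expect to be raw (single Beta distribution) or bits (quantized audio).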

2. Preprocessing

Example usage:

python preprocess.py /path/to/my/wav/files

This will process all the .wav files in the folder /path/to/my/wav/files and save them in the default local directory called data_dir.

Pass --output_dir to store the processed outputs in a different directory.

3. Training

Start the training process. Checkpoints are stored in the local directory checkpoints by default. The script automatically saves a checkpoint when terminated with Ctrl+C.
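
The save-on-interrupt behavior described above can be sketched roughly as follows. This is not the repository's train.py: the function train, its parameters, and the JSON checkpoint format are assumptions made so the example runs without PyTorch. A real training script would more likely write model and optimizer state with torch.save.

```python
import json
import os


def train(num_steps, checkpoint_dir, train_step, start_step=0):
    """Run train_step(step) until num_steps, saving a checkpoint on exit.

    Hypothetical sketch: catches KeyboardInterrupt (Ctrl+C) so that the
    progress made so far is written to disk before the function returns.
    """
    os.makedirs(checkpoint_dir, exist_ok=True)
    step = start_step
    try:
        while step < num_steps:
            train_step(step)
            step += 1
    except KeyboardInterrupt:
        # Fall through to the save below instead of losing progress.
        pass
    # Zero-padded step in the file name, mirroring checkpoint0010000.pth.
    path = os.path.join(checkpoint_dir, "checkpoint{:07d}.json".format(step))
    with open(path, "w") as f:
        json.dump({"step": step}, f)
    return path
```

The step counter is only advanced after train_step returns, so an interrupted step is not counted as completed in the saved checkpoint.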

Example 1: Training a new model

python train.py data_dir

data_dir is the directory containing the processed files.

Example 2: Resuming training from a checkpoint

python train.py data_dir --checkpoint=checkpoints/checkpoint0010000.pth
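
When several checkpoints have accumulated, a small helper can pick the one to pass to --checkpoint. The sketch below assumes file names follow the pattern seen in the example above (checkpoint followed by a zero-padded step count and .pth); the helper names checkpoint_step and latest_checkpoint are hypothetical and not part of this repository.

```python
import re
from pathlib import Path

# Assumed naming pattern, based on checkpoints/checkpoint0010000.pth.
CHECKPOINT_RE = re.compile(r"^checkpoint(\d+)\.pth$")


def checkpoint_step(path):
    """Return the step encoded in a checkpoint file name, or None."""
    match = CHECKPOINT_RE.match(Path(path).name)
    return int(match.group(1)) if match else None


def latest_checkpoint(checkpoint_dir):
    """Return the Path of the checkpoint with the highest step, or None."""
    best_step, best_path = None, None
    for path in Path(checkpoint_dir).glob("checkpoint*.pth"):
        step = checkpoint_step(path)
        if step is not None and (best_step is None or step > best_step):
            best_step, best_path = step, path
    return best_path
```

For example, latest_checkpoint("checkpoints") returns the newest matching file in the default checkpoint directory, or None if there is none yet.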

Evaluation .wav files and plots are saved in checkpoints/eval.

WIP