seungwonpark / melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)
http://swpark.me/melgan/
BSD 3-Clause "New" or "Revised" License
gan neural-vocoder pytorch tts

MelGAN

Unofficial PyTorch implementation of MelGAN vocoder

Key Features

Prerequisites

Tested on Python 3.6

pip install -r requirements.txt

Prepare Dataset

Train & Tensorboard

Pretrained model

Try with Google Colab: TODO

import torch
vocoder = torch.hub.load('seungwonpark/melgan', 'melgan')
vocoder.eval()
mel = torch.randn(1, 80, 234) # placeholder of shape (batch, mel channels, frames); use your own mel-spectrogram here

if torch.cuda.is_available():
    vocoder = vocoder.cuda()
    mel = mel.cuda()

with torch.no_grad():
    audio = vocoder.inference(mel)
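
The snippet above stops once `audio` has been generated. To listen to the result, the samples need to be written to a file. The following is a minimal sketch of one way to do that using only the Python standard library. The helper name `save_wav` is hypothetical and not part of this repository. The default sample rate of 22050 Hz is the LJSpeech-1.1 rate; the output format of `vocoder.inference` is not stated here, so the helper is written to accept either floats in [-1, 1] or 16-bit integer samples.

```python
import array
import sys
import wave


def save_wav(samples, path, sample_rate=22050):
    """Write a mono 16-bit PCM WAV file from a flat sequence of samples.

    Hypothetical helper, not part of the melgan repository.
    Floats are assumed to lie in [-1.0, 1.0] and are scaled to int16;
    ints are assumed to already be int16 values. Everything is clipped
    to the valid int16 range before writing.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for s in samples:
        if isinstance(s, float):
            s = int(round(s * 32767))
        pcm.append(max(-32768, min(32767, int(s))))

    # WAV data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        pcm.byteswap()

    with wave.open(path, "wb") as f:
        f.setnchannels(1)         # mono
        f.setsampwidth(2)         # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())


# Assumed usage with the tensor produced above:
# save_wav(audio.squeeze().cpu().tolist(), "output.wav")
```

Converting the tensor with `.tolist()` keeps the helper free of third-party dependencies. For long clips, a NumPy-based writer would be faster.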

Inference

Results

See audio samples at: http://swpark.me/melgan/. The model was trained on a V100 GPU for 14 days using LJSpeech-1.1.

Implementation Authors

License

BSD 3-Clause License.

Useful resources