
# Multi-speaker FastSpeech 2 - PyTorch Implementation :zap:



## Datasets :elephant:

This project supports two datasets:

* :fire: Single-Speaker: LJSpeech
* :fire: Multi-Speaker: LibriTTS (train-clean-360)

## Config

Configurations are in `hparams`.

Please modify `dataset` and `mfa_path` in `hparams`.

This repo uses MFA v1; migrating to MFA v2 is a TODO item.
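
As a rough illustration, a minimal sketch of the two entries to edit (the values below are placeholder assumptions, not the repo's actual defaults):

```python
# Hypothetical hparams excerpt -- only `dataset` and `mfa_path` are named by this
# README; the values are placeholder assumptions.
dataset = "LJSpeech"                        # or "LibriTTS" for multi-speaker training
mfa_path = "/opt/montreal-forced-aligner"   # root of a local MFA v1 installation
```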

## Steps

  1. preprocess.py
  2. train.py
  3. synthesize.py

### 1. Preprocess

File structure:

```
[DATASET]/wavs/speaker/wav_files
[DATASET]/txts/speaker/txt_files
```
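
For LJSpeech, the original LJSpeech-1.1 release ships as a flat `wavs/` folder plus `metadata.csv`, so it has to be rearranged into the layout above first. A minimal sketch of such a conversion (all paths and the single speaker name `LJSpeech` are assumptions; this is not a script provided by the repo):

```python
# Hypothetical helper: convert the original LJSpeech-1.1 release (flat wavs/
# folder plus metadata.csv) into the [DATASET]/wavs/speaker and
# [DATASET]/txts/speaker layout expected by preprocess.py.
# All paths and the speaker name "LJSpeech" are placeholder assumptions.
import csv
import shutil
from pathlib import Path

src = Path("/storage/tts2021/LJSpeech-1.1")
dst = Path("/storage/tts2021/LJSpeech-organized")
speaker = "LJSpeech"

wav_dir = dst / "wavs" / speaker
txt_dir = dst / "txts" / speaker
wav_dir.mkdir(parents=True, exist_ok=True)
txt_dir.mkdir(parents=True, exist_ok=True)

with open(src / "metadata.csv", encoding="utf-8") as f:
    # metadata.csv rows look like: file_id|raw text|normalized text
    for row in csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE):
        file_id, text = row[0], row[-1]
        shutil.copy(src / "wavs" / f"{file_id}.wav", wav_dir / f"{file_id}.wav")
        (txt_dir / f"{file_id}.txt").write_text(text.strip() + "\n", encoding="utf-8")
```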

Example commands:

* LJSpeech:
```shell
python preprocess.py /storage/tts2021/LJSpeech-organized/wavs /storage/tts2021/LJSpeech-organized/txts ./processed/LJSpeech --prepare_mfa --mfa --create_dataset
```

* LibriTTS:
```shell
python preprocess.py /storage/tts2021/LibriTTS/train-clean-360 /storage/tts2021/LibriTTS/train-clean-360 ./processed/LibriTTS --prepare_mfa --mfa --create_dataset
```

### 2. Train

Train with `train.py` on the processed data. Checkpoints are saved under `./records/`, as shown by the checkpoint path in the synthesis example below.

### 3. Synthesize

Example commands:

```shell
python synthesize.py --ckpt_path ./records/LJSpeech_2021-11-22-22:42/ckpt/checkpoint_125000.pth.tar --output_dir ./output
```
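
If you want to check what a saved checkpoint contains before running synthesis, a minimal sketch (assuming a standard PyTorch checkpoint dictionary; the key names are whatever the training loop saved, which this README does not specify):

```python
# Peek inside a training checkpoint; the path is copied from the example above.
import torch

ckpt = torch.load(
    "./records/LJSpeech_2021-11-22-22:42/ckpt/checkpoint_125000.pth.tar",
    map_location="cpu",
)
print(list(ckpt.keys()))  # e.g. model / optimizer state dicts, step counter
```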

## References :notebook_with_decorative_cover: