jerivl / Deepcut

A robot that raps
Apache License 2.0

Training TTS for a different voice #10

Open jerivl opened 3 years ago

jerivl commented 3 years ago

Issue: Tacotron 2 only comes with a single pretrained voice.

Solution: We can train it further to change the voice it produces. Load the pretrained PyTorch model and apply some gradient steps on data from the new voice (i.e. fine-tuning, a form of transfer learning).
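A minimal sketch of the idea, assuming the pretrained weights are stored in a checkpoint dictionary under a `state_dict` key. The small `nn.Sequential` network, the random tensors, and the `fine_tune` helper are hypothetical stand-ins for illustration only; this is not the Tacotron 2 model or its training loop.

```python
import torch
from torch import nn


def fine_tune(model, batches, loss_fn, lr=1e-4, steps=10):
    """Apply a few gradient steps to an already-trained model on new data."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    losses = []
    for step in range(steps):
        # Cycle through the available batches of new-voice data.
        inputs, targets = batches[step % len(batches)]
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


torch.manual_seed(0)

# Hypothetical stand-in for the pretrained network (NOT Tacotron 2).
pretrained = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
checkpoint = {"state_dict": pretrained.state_dict()}

# Build the same architecture and start from the pretrained weights
# instead of a random initialization.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
model.load_state_dict(checkpoint["state_dict"])

# Random tensors standing in for (input, target) pairs from the new voice.
new_inputs = torch.randn(32, 8)
new_targets = torch.randn(32, 4)

losses = fine_tune(
    model, [(new_inputs, new_targets)], nn.MSELoss(), lr=1e-2, steps=50
)
```

For the real model, the learning rate and number of steps would need tuning; a smaller learning rate than the one used for training from scratch is a common starting point when fine-tuning.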

Desired input: A pretrained model and a training corpus for supervised learning. I'm not sure yet exactly what the corpus entails, in particular which labels and audio features their model requires; see https://github.com/NVIDIA/tacotron2 and https://arxiv.org/abs/1712.05884. We may need to create our own dataset for training. A new issue will be posted with further details.
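As a starting point for the dataset question: the filelists bundled with the NVIDIA repository pair an audio path with its transcript on each line, separated by `|`. Assuming our corpus follows the same layout, a sketch for generating such a file could look like this. The `build_filelist` helper, the clip paths, and the transcripts are all hypothetical, and the exact audio requirements (sample rate, trimming) still need to be confirmed against the repository.

```python
import tempfile
from pathlib import Path


def build_filelist(clips, out_path):
    """Write one `wav_path|transcript` line per (wav_path, transcript) pair."""
    lines = []
    for wav_path, transcript in clips:
        # Collapse newlines and repeated spaces so each clip stays on one line.
        text = " ".join(transcript.split())
        if "|" in text:
            raise ValueError(f"transcript for {wav_path} contains the '|' separator")
        lines.append(f"{wav_path}|{text}")
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


# Hypothetical clips recorded in the new voice.
clips = [
    ("wavs/clip_0001.wav", "Hello  world."),
    ("wavs/clip_0002.wav", "This is a\nsecond clip."),
]

with tempfile.TemporaryDirectory() as tmp_dir:
    filelist_path = Path(tmp_dir) / "train_filelist.txt"
    lines = build_filelist(clips, filelist_path)
    written = filelist_path.read_text(encoding="utf-8")
```

A separate validation filelist can be produced the same way from a held-out subset of clips.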

Desired output: A fine-tuned model checkpoint after some number of training epochs.