DanRuta / xVA-Synth

Machine learning based speech synthesis Electron app, with voices from specific characters from video games
GNU General Public License v3.0

Using the program for other voices #15

Open StElysse opened 3 years ago

StElysse commented 3 years ago

Hello,

How can I use the FastPitch part of this program for a non-Bethesda game speaker?

I discovered this repo while working on a personal modding project involving voice synthesis. Over the past week, I’ve been using the Real Time Voice Cloning repo to fine-tune its pretrained models on a single speaker, with some success. The model is slowly improving, but my one gripe is that I can’t control the pitch of the generated audio.

Would it be possible for you to tell me how I can modify or use your repo for a non-Bethesda speaker? I know how to compile datasets with LibriTTS and train a pre-made synthesizer on them, but not much else.

If you can help, I’d be grateful!

DanRuta commented 3 years ago

Hi there. xVA is just an app for running inference (with editing) on FastPitch models. Your best bet is to head over to NVIDIA's GitHub page for FastPitch, where they give instructions on how to train the model. Once you've trained a model on your data, you can drop it into xVA to use the editor.
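
For anyone following this route with their own recordings, the data-preparation step might look roughly like the sketch below. It assumes a hypothetical folder layout in which each audio clip sits next to a same-named `.txt` transcript, and it emits pipe-separated `audio path|transcript` lines in the style of LJSpeech-type filelists. The function name and the exact line format are assumptions, not something taken from xVA-Synth or confirmed against NVIDIA's scripts, so check the FastPitch README for the layout its training code actually expects.

```python
from pathlib import Path


def build_filelist(dataset_dir, audio_ext=".wav", text_ext=".txt"):
    """Return 'relative/audio/path|transcript' lines for clips that have a transcript.

    Hypothetical helper: the pipe-separated layout is an assumption modeled on
    LJSpeech-style filelists, not a confirmed FastPitch requirement.
    """
    root = Path(dataset_dir)
    lines = []
    for audio_path in sorted(root.rglob(f"*{audio_ext}")):
        text_path = audio_path.with_suffix(text_ext)
        if not text_path.exists():
            continue  # skip clips that have no transcript file
        # Collapse newlines and repeated spaces so each entry stays on one line.
        transcript = " ".join(text_path.read_text(encoding="utf-8").split())
        if not transcript:
            continue  # skip empty transcripts
        relative_audio = audio_path.relative_to(root).as_posix()
        lines.append(f"{relative_audio}|{transcript}")
    return lines


if __name__ == "__main__":
    # "my_speaker" is a placeholder directory name.
    for line in build_filelist("my_speaker"):
        print(line)
```

Clips without a matching transcript are skipped rather than raising an error, which makes it easier to build the list incrementally while transcribing.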