Closed fungus75 closed 1 year ago
If necessary, the file reference.wav can be downloaded here: https://drive.google.com/drive/folders/1SbMCLyD3YDie6lVl9lPgEdxZA9U0ZnJm?usp=sharing
I'm curious about the final result, as it's based on my voice dataset 🙂.
If necessary, you can download the model (Checkpoint) and config.json from here: https://drive.google.com/drive/folders/1bU9ObB1Z30VoT5miTXEW2bDW1EODw-gr?usp=sharing
Hi @fungus75,
It looks like you trained YourTTS only on the Thorsten dataset, which is a single-speaker dataset. A model trained this way will not give you good voice conversion performance.
What is broken in your inference is that for voice conversion you need to provide the speaker_wav and the reference_wav. You can see some instructions here: https://github.com/Edresson/YourTTS#voice-conversion
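For anyone landing here later, a minimal sketch of what that call could look like from Python. It assumes `Synthesizer.tts` accepts `speaker_wav` and `reference_wav` as keyword arguments, as it does in recent Coqui TTS releases, and that the checkpoint is a multi-speaker model. The helper names `build_vc_kwargs` and `convert_voice` and all file paths are made up for illustration. A multilingual model would probably also need a language argument, which is left out here.

```python
def build_vc_kwargs(speaker_wav, reference_wav):
    """Collect the two audio arguments that voice conversion needs.

    speaker_wav:   clip of the target voice (how the output should sound).
    reference_wav: clip whose spoken content gets converted.
    """
    missing = [
        name
        for name, value in (("speaker_wav", speaker_wav), ("reference_wav", reference_wav))
        if not value
    ]
    if missing:
        raise ValueError("voice conversion needs: " + ", ".join(missing))
    return {"speaker_wav": speaker_wav, "reference_wav": reference_wav}


def convert_voice(model_path, config_path, speaker_wav, reference_wav, out_path):
    """Hypothetical wrapper: run voice conversion and write the result to out_path."""
    kwargs = build_vc_kwargs(speaker_wav, reference_wav)
    # Imported lazily so the argument check above works without Coqui TTS installed.
    from TTS.utils.synthesizer import Synthesizer

    synth = Synthesizer(model_path, config_path, use_cuda=False)
    # No text is passed: the content comes from reference_wav (assumed behaviour).
    wav = synth.tts(**kwargs)
    synth.save_wav(wav, out_path)
    return out_path
```

Usage would be something like `convert_voice("best_model.pth", "config.json", "target_speaker.wav", "reference.wav", "test.wav")`. The check runs before the model is loaded, so a forgotten `speaker_wav` fails fast with a readable message instead of an error from deep inside the synthesizer.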
Hello. I have the same problem. How did you solve it?
You have to train on a multi-speaker dataset. Then you can use reference.wav. I had trained on a single-speaker dataset only. As soon as I trained a multi-speaker model, it worked.
Describe the bug
I trained a YourTTS model with the Thorsten dataset (downsampled to 16000 Hz). Except for the problem described in https://github.com/coqui-ai/TTS/issues/2391, the training and voice generation worked perfectly.
But now I wanted to use reference_wav for voice conversion and this throws an error.
To Reproduce
I ran this script in exactly the same environment in which I created the model:
---cut--
import os
from TTS.utils.synthesizer import Synthesizer

MODEL_PATH = "best_model.pth"
CONFIG_PATH = "config.json"
OUT_PATH = "."

s = Synthesizer(MODEL_PATH, CONFIG_PATH, use_cuda=True)
# Voice conversion attempt: this is the call that raises the error
wav = s.tts("Hallo ich bin Eric und wie geht es euch?", reference_wav="reference.wav")
s.save_wav(wav, os.path.join(OUT_PATH, "test.wav"))
---cut--
Expected behavior
The script should save the file test.wav in the given folder, but instead it crashes.
Logs
Environment
Additional context
No response