rhasspy / piper

A fast, local neural text to speech system
https://rhasspy.github.io/piper-samples/
MIT License
4.37k stars, 297 forks

Piper for Voice Conversion #448

Open slimsushi opened 1 month ago

Hi,

I need a fast, direct audio-to-audio voice conversion model that I can fine-tune with my own German data. I found the voice_conversion.py script, in which Piper seems to be used for direct voice conversion.

My question: can I use any of the checkpoints published here? https://huggingface.co/datasets/rhasspy/piper-checkpoints/tree/main

When I tried the German checkpoint, the script raised an error. Do I need to preprocess or otherwise prepare the model before using it in the script? Or are there specific pretrained checkpoints for voice conversion? Unfortunately, I couldn't find any hints, tutorials, or READMEs on using Piper for voice conversion. Also, does training Piper for voice conversion with a new voice differ from training it for TTS?

Thank you for your help!