yl4579 / StarGANv2-VC

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
MIT License

How to train new voices? How to use them? #65

Closed (TutajITeraz closed this 1 year ago)

TutajITeraz commented 1 year ago

Hello! I'm a beginner hobbyist in data science. I have trained PitchExtractor and ASR using my voice, so I have .pth files.

When I try to load my own file in its place, e.g. torch.load('Models/lukasz_pe_epoch_0150.pth')['model'] ("lukasz_pe_epoch_0150.pth" is the output of the PitchExtractor), the resulting voice sounds almost the same as the original speaker. Can someone help me with this?

yl4579 commented 1 year ago

You need at least 2 speakers to train a voice conversion model (if that was your problem).
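As a rough illustration of the two-speaker requirement, here is a minimal sketch that checks a training list before starting a run. It assumes each line has the form `path|speaker_label`; that format, the example paths, and the helper names `count_speakers` and `check_train_list` are assumptions made for illustration and should be checked against the repository's own data lists.

```python
from collections import Counter


def count_speakers(lines):
    """Count utterances per speaker label in 'path|label' lines (assumed format)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last '|' so that paths containing '|' still parse.
        path, _, label = line.rpartition("|")
        if not path or not label:
            raise ValueError(f"expected 'path|label', got: {line!r}")
        counts[label] += 1
    return counts


def check_train_list(lines, min_speakers=2):
    """Raise ValueError if the list contains fewer than min_speakers distinct labels."""
    counts = count_speakers(lines)
    if len(counts) < min_speakers:
        raise ValueError(
            f"found {len(counts)} speaker(s); "
            f"voice conversion needs at least {min_speakers}"
        )
    return counts


# Hypothetical list: two clips of one voice, one clip of another.
example = [
    "./Data/my_voice/001.wav|0",
    "./Data/my_voice/002.wav|0",
    "./Data/target_voice/001.wav|1",
]
counts = check_train_list(example)
print(dict(counts))
```

A list containing only one label would raise `ValueError` here, which matches the situation where training on a single voice gives the model nothing to convert between.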

TutajITeraz commented 1 year ago

So, if I understand correctly: if I want to perform voice style transfer, I should train PitchExtractor on both my voice and the voice I want to clone?

Do the two voices have to say the same phrases?