TutajITeraz closed this issue 1 year ago
You need at least 2 speakers to train a voice conversion model (if that was your problem).
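One way to check this before training is to count the distinct speaker labels in the training list. The following is only a minimal sketch: it assumes the list is a text file with one `path|speaker` entry per line, and that format, the function names, and the example paths are all hypothetical illustrations, not something confirmed in this thread.

```python
from collections import Counter


def count_speakers(lines, sep="|"):
    """Count utterances per speaker label in 'path|speaker' lines (assumed format)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if sep not in line:
            raise ValueError(f"missing {sep!r} separator in line: {line!r}")
        # Split from the right so a separator inside the path does not break parsing.
        _path, speaker = line.rsplit(sep, 1)
        counts[speaker.strip()] += 1
    return counts


def has_enough_speakers(lines, minimum=2):
    """Return True if the list contains at least `minimum` distinct speakers."""
    return len(count_speakers(lines)) >= minimum


# Hypothetical entries: two clips from one speaker, one clip from another.
example = [
    "Data/lukasz/001.wav|0",
    "Data/lukasz/002.wav|0",
    "Data/target/001.wav|1",
]
print(count_speakers(example))
print(has_enough_speakers(example))
```

If the check returns False, the training list contains only one speaker, which matches the situation described in the reply above.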
So, if I understand correctly: if I want to perform voice style transfer, I should train PitchExtractor on both my voice and the voice I want to clone?
Do the two voices have to say the same phrases?
Hello! I'm a beginner hobbyist in data science. I have trained PitchExtractor and ASR using my voice, so I have .pth files.
When I try to replace it with my file, e.g.:
torch.load('Models/lukasz_pe_epoch_0150.pth')['model']
("lukasz_pe_epoch_0150.pth" is the output of the PitchExtractor.) The resulting voice sounds almost the same as the original speaker. Can someone help me with this?