mudabek closed 4 months ago
This repo does not contain a synthesis module that converts from input features to speech. That's coming soon (in a different repo). I'll keep this issue open until then.
Hi! I am grateful for the work and am also wondering whether the voice conversion repo is still coming.
Yes, the repo containing VC is still coming. I do not have a fixed release date, but somewhere in the range of 2-6 months is probably correct.
The synthesis method in our paper "High-Fidelity Neural Phonetic Posteriorgrams" is relatively naive and not intended to be SOTA VC (yet). If I released the VC now, researchers would use it to discredit our entire PPG representation in favor of other representations. Once released, the VC repo will contain the additional details needed to finish the story.
Stay tuned! =)
Hi @maxrmorrison, could you provide some hints on how you used the PPG features to train VC? :) I'd also be interested in whether durations could be modeled using the PPG representations, for VC with varying durations.
Our follow-up paper is out! The system it describes is amenable to high-fidelity voice conversion with reconstruction quality on par with Mel vocoding. Code coming soon =)
If you'd like even more details, I provide the whole story (and more VC examples) in my thesis
Code is released! Thanks for your patience
Hey! Thank you for sharing your work!
In the demo there is an example of voice conversion, but I couldn't find a voice conversion function in the repo. Could you share how you ran voice conversion?