How to inference with unseen speakers?

yl4579 / StarGANv2-VC

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

MIT License

466 stars, 110 forks

How to inference with unseen speakers? #13

Closed: Charlottecuc closed this issue 2 years ago

Charlottecuc commented 2 years ago

Hi. The work is amazing. I noticed that the inference.py file only supports many-to-many conversion. Could you tell me how to modify it for any-to-many conversion and also for singing voice conversion? Thank you very much.

yl4579 commented 2 years ago

The model's input does not depend on speaker identity, so you can feed in audio from whatever speaker you want to convert, and the same goes for singing input. If you'd like to train your own model, you can simply use singing data and it will work as well. You may need to find a better vocoder for singing, though, because ParallelWaveGAN isn't very good at singing synthesis.
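
To make the point above concrete, here is a minimal sketch of what "any-to-many" means in practice: only the target speaker has to be one the model was trained on, while the source audio carries no speaker label at all. This is not the repository's code. `AnyToManyConverter`, `identity_stub`, and the speaker names are hypothetical placeholders, and the stub stands in for the real model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical signature for a trained model's conversion step:
# (source samples, target speaker label) -> converted samples.
# This is an illustration only, NOT the StarGANv2-VC API.
ConvertFn = Callable[[List[float], int], List[float]]


@dataclass
class AnyToManyConverter:
    """Wraps a conversion function with a check on the target speaker only."""

    target_speakers: Dict[str, int]  # trained speaker name -> label index
    convert_fn: ConvertFn

    def convert(self, source_audio: List[float], target_name: str) -> List[float]:
        # The source needs no label: any speaker (or singer) may be passed in.
        # Only the target must be among the speakers seen during training.
        if target_name not in self.target_speakers:
            raise KeyError(f"target speaker {target_name!r} was not seen in training")
        return self.convert_fn(source_audio, self.target_speakers[target_name])


def identity_stub(audio: List[float], target_label: int) -> List[float]:
    """Placeholder for the real model; returns a copy of the input unchanged."""
    return list(audio)


if __name__ == "__main__":
    converter = AnyToManyConverter({"speaker_a": 0, "speaker_b": 1}, identity_stub)
    unseen_source = [0.1, -0.2, 0.05]  # audio from a speaker outside the training set
    converted = converter.convert(unseen_source, "speaker_b")
    print(len(converted))
```

In a real setup the stub would be replaced by the trained model's forward pass followed by a vocoder; the lookup logic around it would stay the same.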