yl4579 / StarGANv2-VC

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
MIT License

Questions about One-to-Many #82

Closed roinhutovo closed 1 year ago

roinhutovo commented 1 year ago

Thanks for your work on this, @yl4579. I have a few questions:

  1. Is there anything I should watch out for or modify when doing one-to-many conversion (specifically, my voice to my friends' voices)? I just want to use a single source speaker for stability.
  2. If I want to add another target speaker, can I just fine-tune the pre-trained model on this new speaker's dataset?

yl4579 commented 1 year ago

There's no way to train for one-to-many only, because the model relies on the cycle consistency loss to learn how to generate style vectors from either the mapping network or the style encoder. Many-to-many already includes one-to-many, so I'm not sure what your reason is for doing only one-to-many.
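
For readers unfamiliar with the term, the cycle consistency loss in StarGAN v2-style models is usually written roughly as below. The notation is an assumption for illustration and is not taken from this thread: $G$ is the generator, $S$ the style encoder, $x$ a source utterance from speaker $y_{src}$, and $s$ a target style vector.

$$
\mathcal{L}_{cyc} = \mathbb{E}_{x,\, y_{src},\, s}\left[ \left\lVert x - G\big(G(x, s),\, S(x, y_{src})\big) \right\rVert_1 \right]
$$

The outer call to $G$ converts the output back into the source speaker's style, so the source speaker also has to be a domain the model can generate. That is consistent with the point above that the source speaker cannot be left out of the set of domains.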

As for adding a new target speaker, you can fine-tune the old model on the new speaker's dataset; just set num_domain to the old value plus 1.
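
The bookkeeping for "old value plus 1" can be sketched as below. This is a minimal illustration only: the helper `add_target_speaker`, the speaker names, and the assumption that domain indices are 0-based and contiguous are hypothetical, and the config key is spelled as in the comment above, so check it against your own config file.

```python
def add_target_speaker(config, speaker_to_domain, new_speaker, key="num_domain"):
    """Return updated copies of the config and speaker map with one extra domain.

    Hypothetical helper: assumes existing domain indices are exactly
    0..num_domain-1, so the new speaker takes index `num_domain`.
    """
    if new_speaker in speaker_to_domain:
        raise ValueError(f"speaker {new_speaker!r} already has a domain index")

    old_count = config[key]
    if sorted(speaker_to_domain.values()) != list(range(old_count)):
        raise ValueError("existing domain indices must be exactly 0..num_domain-1")

    # Copy rather than mutate, so the old settings stay available for reference.
    new_config = {**config, key: old_count + 1}
    new_map = {**speaker_to_domain, new_speaker: old_count}
    return new_config, new_map


# Example with made-up speakers: three existing domains, one new target.
old_config = {"num_domain": 3}
old_speakers = {"alice": 0, "bob": 1, "carol": 2}
new_config, new_speakers = add_target_speaker(old_config, old_speakers, "dave")
print(new_config["num_domain"], new_speakers["dave"])
```

Any checkpoint tensors whose shape depends on the number of domains would presumably need special handling when the old weights are loaded; the thread does not cover that step, so it is not shown here.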