Closed: 980202006 closed this issue 2 years ago
Hello, on the samples page I saw the Neutral-to-Emotional audio and was surprised by the result, because it does two things at once: emotion conversion and voice conversion. Are two style encoders used? And are there any changes in training or model structure compared to the original model?

It uses exactly the same architecture but is trained on a different dataset. You only need to change the dataset and you get emotion conversion for free, since training is completely unsupervised: the emotions and styles are learned automatically.

What an excellent discriminator design, thank you!
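Since only the dataset changes, the practical step is to prepare the training data with emotions in place of speakers as the domain labels. Below is a minimal sketch of that step: it is a hypothetical helper, assuming a directory-per-emotion layout and a `path|domain_index` train-list format (the function name `build_train_list` and the list format are assumptions, not part of this repo's actual scripts).

```python
from pathlib import Path

def build_train_list(data_root, out_file):
    # Hypothetical helper: each subdirectory of data_root is treated as
    # one domain (here, one emotion instead of one speaker). The model
    # then learns the corresponding styles without any further labels.
    domains = sorted(p for p in Path(data_root).iterdir() if p.is_dir())
    lines = []
    for idx, domain_dir in enumerate(domains):
        for wav in sorted(domain_dir.glob("*.wav")):
            # Assumed format: one "<wav path>|<domain index>" entry per line.
            lines.append(f"{wav}|{idx}")
    Path(out_file).write_text("\n".join(lines))
    return lines
```

With a layout like `data/angry/*.wav`, `data/happy/*.wav`, this would assign domain index 0 to `angry` and 1 to `happy`; retraining on such a list is the "change the dataset" step, with no architectural modification.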