CookiePPP / cookietts

[Last Updated 2021] TTS from Cookie. Messy and experimental!
BSD 3-Clause "New" or "Revised" License

TorchMoji Usage and Style Transfer #31

Open kannadaraj opened 3 years ago

kannadaraj commented 3 years ago

Thanks a lot for sharing such an awesome package. I have a query regarding the TorchMoji embedding being used. Using TorchMoji and Emotionnet together is an interesting combination. The semi-supervised emotional VAE module will learn a latent space over the emotions, which helps in projecting any text to any point in that latent space during inference. For example, we can make an inherently sad sentence be spoken in a very happy manner by choosing appropriate latent variables. But the TorchMoji embedding, if we also use it, is an approximately literal representation of the emotion in the text. Style transfer would then be adversely affected by a text embedding that points to a different emotion. So won't the two approaches actually work against each other?

Could you please elaborate on their interaction, or combined effect, during training and synthesis? Thanks a lot.
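
To make the concern concrete, here is a minimal sketch in plain NumPy. It assumes, and I have not confirmed this from the cookietts code, that the decoder is conditioned on a simple concatenation of a text-derived emotion embedding and a VAE style latent. The function `build_conditioning`, the dimensions, and the random vectors are all hypothetical stand-ins, not the repository's actual API or values.

```python
import numpy as np

rng = np.random.default_rng(0)

TEXT_EMB_DIM = 8   # hypothetical size of the projected TorchMoji embedding
LATENT_DIM = 4     # hypothetical size of the emotion VAE latent


def build_conditioning(text_emotion_emb, style_latent):
    """Concatenate the text-derived emotion embedding with the VAE style latent.

    Assumption: concatenation is used only for illustration; the real model
    may combine the two signals differently.
    """
    return np.concatenate([text_emotion_emb, style_latent])


# Random stand-ins for real model outputs (illustration only).
sad_text_emb = rng.normal(size=TEXT_EMB_DIM)   # embedding of a "sad" sentence
sad_latent = rng.normal(size=LATENT_DIM)       # latent that matches the text
happy_latent = rng.normal(size=LATENT_DIM)     # latent chosen for style transfer

matched = build_conditioning(sad_text_emb, sad_latent)
transferred = build_conditioning(sad_text_emb, happy_latent)

# Only the latent slice differs; the text-emotion slice is untouched.
changed = ~np.isclose(matched, transferred)
print("dimensions changed by style transfer:",
      int(changed.sum()), "of", matched.size)
```

Under this assumption, swapping the latent leaves the text-emotion part of the conditioning vector exactly as it was, so the decoder would receive a "happy" latent alongside a "sad" text embedding. That mismatch is the conflict I am asking about.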