How to achieve style embedding with different weights of each token without reference audio?

syang1993 / gst-tacotron

A TensorFlow implementation of "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"


bitwangyujia closed 5 years ago

jdosoriopu commented 5 years ago

Hi, did you figure this out?

niu0717 commented 4 years ago

I want to know the same thing. thx!