syang1993 / gst-tacotron

A TensorFlow implementation of "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"

What is in reference audio path? #42

Open Thien223 opened 4 years ago

Thien223 commented 4 years ago

Dear @syang1993, thank you for your effort and for kindly sharing this project with us.

I have one question about this project. Please correct me if I am wrong.

As I understand it, during training the reference mel is the target mel spectrogram, and during synthesis we need to pass a reference audio path. I do not understand what should be at that path. Is it the reference mel spectrograms for every type of audio (angry, happy, sad, ...), for just one type, or a single mel spectrogram? Are they exported NumPy arrays (*.npy)?
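To make the question concrete, here is a minimal sketch of one of the possibilities I am asking about: a single reference mel spectrogram stored as a `.npy` file. This is only my guess and not the repository's actual code. The helper `load_reference_mel`, the file name `reference.npy`, the 80 mel bands, and the random stand-in data are all assumptions for illustration.

```python
import os
import tempfile

import numpy as np


def load_reference_mel(path):
    """Hypothetical helper: load one mel spectrogram saved as .npy
    and add a leading batch axis."""
    mel = np.load(path)  # expected shape: (frames, n_mels)
    if mel.ndim != 2:
        raise ValueError("expected a 2-D array of shape (frames, n_mels)")
    return mel[np.newaxis, :, :]  # shape: (1, frames, n_mels)


# Stand-in data: random values in place of a real mel spectrogram.
n_mels = 80  # assumed number of mel bands
target_mel = np.random.rand(120, n_mels).astype(np.float32)

# Training, as I understand it: the reference is simply the target mel.
train_reference = target_mel[np.newaxis, :, :]

# Synthesis, under my guess: the reference is read from a file on disk.
tmp_dir = tempfile.mkdtemp()
ref_path = os.path.join(tmp_dir, "reference.npy")  # hypothetical file name
np.save(ref_path, target_mel)
synth_reference = load_reference_mel(ref_path)

print(synth_reference.shape)
```

If the path is instead expected to be a raw audio file, the loading step would have to compute the mel spectrogram from the waveform rather than call `np.load`.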

Thank you so much, and thank you again.