Open ngocanh2162 opened 2 years ago
Thank you for your question.
When training, the attention weights of the GST tokens are computed from the prosody embedding of the reference utterance (i.e. the input utterance).
When synthesizing, the attention weights are passed in through the argument atten_weights_ph (short for "attention weights placeholder"), which is computed offline by averaging the attention weights of the top-K utterances of each emotion.
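For illustration, the offline averaging step could be sketched roughly as below. This is only a sketch and not the repository's actual code: the function name, the (emotion, score, weights) tuple layout, and the score used to pick the top-K utterances are all assumptions, since the ranking criterion is not stated in this thread.

```python
from collections import defaultdict


def average_topk_attention(utterances, k):
    """Average GST attention weights over the top-k utterances per emotion.

    utterances: list of (emotion, score, weights) tuples, where `score` is a
    hypothetical ranking value and `weights` is a list of floats, one per
    style token. Returns a dict mapping emotion -> averaged weight list.
    """
    # Group the utterances by their emotion label.
    by_emotion = defaultdict(list)
    for emotion, score, weights in utterances:
        by_emotion[emotion].append((score, weights))

    averaged = {}
    for emotion, items in by_emotion.items():
        # Keep the k highest-scoring utterances (assumed selection rule).
        top = sorted(items, key=lambda item: item[0], reverse=True)[:k]
        num_tokens = len(top[0][1])
        # Element-wise mean across the selected weight vectors.
        averaged[emotion] = [
            sum(w[i] for _, w in top) / len(top) for i in range(num_tokens)
        ]
    return averaged


if __name__ == "__main__":
    # Toy data with two style tokens per utterance.
    data = [
        ("happy", 0.9, [0.5, 0.5]),
        ("happy", 0.8, [1.0, 0.0]),
        ("happy", 0.1, [0.0, 1.0]),
        ("sad", 0.7, [0.25, 0.75]),
    ]
    print(average_topk_attention(data, k=2))
```

The averaged vector for the desired emotion would then be what gets fed in as atten_weights_ph at synthesis time.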
In file modules/attention.py, lines 434-435: when I run inference, it gets stuck at this tensor. I cannot find any reference to it.