Sangkikim-77 closed 4 years ago
Closing due to inactivity.
Hi, I found that when synthesizing from Hallelujah musicxml, changing the mel input seems
The mel is used for the Global Style Tokens.
Hi, thanks a lot for your work! I am also curious about why GST is needed when synthesizing from MusicXML. We do not actually have a mel when synthesizing from MusicXML. If the task is to synthesize from MusicXML, we do not need the GST part during training, right?
The Global Style Tokens can inject the style (screaming, whispering, etc.) of a reference mel-spectrogram into the synthesized speech. If you remove GST during training, you won't be able to take advantage of this.
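To make the role of the mel concrete, here is a minimal sketch of the general GST idea in plain Python. It is not this repository's code: the names `reference_embedding` and `style_embedding`, the toy token values, and the use of a simple time average in place of a learned reference encoder are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reference_embedding(mel):
    """Collapse a reference mel (list of frames) into one vector.

    Assumption: a plain average over time stands in for the learned
    reference encoder that a real GST model would use.
    """
    n_frames = len(mel)
    n_bins = len(mel[0])
    return [sum(frame[i] for frame in mel) / n_frames for i in range(n_bins)]


def style_embedding(mel, tokens):
    """Attend over the style tokens, using the reference mel as the query."""
    query = reference_embedding(mel)
    scale = math.sqrt(len(query))
    # Scaled dot-product score between the query and each style token.
    scores = [sum(q * k for q, k in zip(query, token)) / scale for token in tokens]
    weights = softmax(scores)
    dim = len(tokens[0])
    # The style embedding is the attention-weighted sum of the tokens.
    embedding = [sum(w * token[i] for w, token in zip(weights, tokens)) for i in range(dim)]
    return weights, embedding


# Hypothetical toy data: 2 mel bins and 3 style tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
mel_a = [[2.0, 0.0], [4.0, 0.0]]  # energy concentrated in the first bin
mel_b = [[0.0, 3.0], [0.0, 3.0]]  # energy concentrated in the second bin

weights_a, style_a = style_embedding(mel_a, tokens)
weights_b, style_b = style_embedding(mel_b, tokens)
```

In this sketch, the two reference mels pick different dominant tokens and therefore produce different style vectors. The mel only shapes the style vector; in this toy setup it carries no information about the notes or lyrics.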
When I synthesize from a music score (MusicXML), I have to pass a "mel" as an input parameter. However, in the provided inference.ipynb, the mel input is taken from the dataloader, which has nothing to do with Haendel's Hallelujah.
Can I use any mel?