Open Bardo-Konrad opened 4 years ago
Hi, did you use a pretrained model? Which version of the repo are you using (which commit)? It might be samples from an older model file than the most recent pretrained model available.
edit: I noticed you added with stress and normalizer to the config. These are normally in data_config.yaml. Also, I don't think stress is available with models trained with the wavernn preprocessing, so you might be using a recent version (commit) of the repo instead of the right one. Please check out the commit listed next to the model file in the table. Also, I discourage using the autoregressive model: it is unstable by nature and has "noise" injected in the decoder, so results will vary with the random seed. To make it work I had to add a comma in the sentence after "trump"; that comma is not there anymore because of the comparison with ForwardTacotron.
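To illustrate the point about decoder noise and the random seed, here is a minimal sketch. It is not the repo's code: `decode_with_noise` is a hypothetical stand-in, and it assumes the injected noise behaves like dropout applied at inference time. It only shows that a fixed seed reproduces the output, while a different seed changes it.

```python
import random


def decode_with_noise(frames, seed, drop_rate=0.5):
    """Hypothetical stand-in for a decoder step with dropout-style noise.

    Each value is zeroed with probability drop_rate; kept values are
    rescaled so the expected value is unchanged (inverted dropout).
    """
    rng = random.Random(seed)  # local generator, so the seed fully decides the noise
    keep_scale = 1.0 / (1.0 - drop_rate)
    return [0.0 if rng.random() < drop_rate else x * keep_scale for x in frames]


frames = [1.0] * 64  # dummy input, not real mel frames

same_a = decode_with_noise(frames, seed=0)
same_b = decode_with_noise(frames, seed=0)
other = decode_with_noise(frames, seed=1)

print("same seed, same output:", same_a == same_b)
print("different seed, same output:", same_a == other)
```

So when comparing against published samples from the autoregressive model, differences between runs are expected unless the seed is fixed as well.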
edit: listening to the audio, it might actually be that you're phonemizing with stress, but you shouldn't.
Also, if you're interested in replicating the results using our pretrained models, you can just try the Colab Notebooks.
I got only two warning messages
WARNING: could not retrieve git hash. Command '['git', 'describe', '--always']' returned non-zero exit status 128.
WARNING: could not check git hash. Command '['git', 'describe', '--always']' returned non-zero exit status 128.
restored weights from ljspeech_melgan_forward_transformer/melgan/forward_weights/ckpt-179 at step 895000
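For context on the two warnings above: exit status 128 from `git describe --always` typically means the command was not run inside a git checkout (for example, the code came from a zip download), which is why the commit cannot be verified. A minimal sketch of such a check, assuming a hypothetical helper `get_git_hash` (this is not the repo's actual function):

```python
import subprocess


def get_git_hash(cwd=None):
    """Return the output of `git describe --always`, or None if it fails.

    Failure covers: not a git repository (non-zero exit status),
    git not installed, or cwd not existing.
    """
    try:
        out = subprocess.check_output(
            ["git", "describe", "--always"],
            cwd=cwd,
            stderr=subprocess.DEVNULL,  # hide git's own error message
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    return out.decode().strip()


result = get_git_hash()
if result is None:
    print("could not retrieve git hash (not a git checkout?)")
else:
    print("current commit:", result)
```

If this prints the failure message, cloning the repo with git (rather than downloading an archive) makes it possible to check which commit is in use.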
I changed the sentence to 'Hello, how are you?'
I got Herunterladen.zip.
Sounds as bad as my sample, yet this time I followed the colab.
Which improvement do you suggest, now that we see the exact same approach?
Sounds fine to me. This is inverted with the Griffin-Lim algorithm, so the sound quality is expected to be low. You need to follow the next steps in the notebook and convert it with the vocoder.
My predict.py:
I changed Architecture of autoregressive_config.yaml to
And I got the attached wave file PresidentTrumpmetwithotherleadersattheGroupoftwentyconference.zip. It does not sound the same as the samples at https://as-ideas.github.io/TransformerTTS/.
What is missing?