roedoejet / FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
MIT License

Duration loss stuck at 0 when use_energy_predictor is turned off #6

Closed: jzhu709 closed this issue 1 year ago

jzhu709 commented 1 year ago

Just moving our email conversation to GitHub so that everyone can see the issue (and hopefully the fix!).

When I run the train.py script, the output log files report a duration loss of 0 whenever the use_energy_predictor config in model.yaml is set to false.

I am using the LJSpeech dataset (https://keithito.com/LJ-Speech-Dataset/) and the TextGrids linked from the README file (https://drive.google.com/drive/folders/1DBRkALpPd6FL9gjHMmMEdHODmkgNIIK4).

This also affects any new language being trained, as long as the use_energy_predictor config is set to false, which is the setting your paper recommends because you found it works better for low-resource languages.

roedoejet commented 1 year ago

Thanks for moving this here, Josh. Oh dear, I think I see an issue here with the log. I fixed it in this commit: https://github.com/roedoejet/FastSpeech2/commit/aabe8c26292ecf836979b0e7b1c33cb241f7b77a

My sincere apologies, as I know this has caused some headaches. Regarding the other issue that you mentioned over email, I will post it in a new issue.
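
For anyone reading later, here is a minimal sketch of how a logging-only bug like this can happen. It is not the code from the commit above: every name in it (`compute_losses`, `format_positional`, `format_by_name`, the loss names, and the dummy values) is hypothetical and chosen only for illustration. It assumes the log labels loss values by position, so dropping the energy term shifts the remaining values and the last label is left with a padded 0.

```python
from typing import Dict, List

# Hypothetical label order used by the log line (an assumption, not the repo's).
ALL_NAMES = ["mel", "pitch", "energy", "duration"]


def compute_losses(use_energy_predictor: bool) -> Dict[str, float]:
    """Return dummy loss values; the energy term is skipped when disabled."""
    losses = {"mel": 0.5, "pitch": 0.3}
    if use_energy_predictor:
        losses["energy"] = 0.1
    losses["duration"] = 0.2
    return losses


def format_positional(values: List[float]) -> str:
    """Fragile: pads missing values with 0 and labels them by position."""
    padded = values + [0.0] * (len(ALL_NAMES) - len(values))
    return ", ".join(f"{n}: {v:.2f}" for n, v in zip(ALL_NAMES, padded))


def format_by_name(losses: Dict[str, float]) -> str:
    """Robust: looks each loss up by name and skips the ones not computed."""
    return ", ".join(f"{n}: {losses[n]:.2f}" for n in ALL_NAMES if n in losses)


if __name__ == "__main__":
    losses = compute_losses(use_energy_predictor=False)
    print(format_positional(list(losses.values())))
    print(format_by_name(losses))
```

In this sketch the positional formatter prints the duration value under the energy label and shows the duration as 0.00, even though the underlying loss is nonzero. Keying the log by name avoids the shift.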