Open xujzouyyz opened 6 months ago
after 25 epochs ?
What's your batch size and dataset? It's possible that this is the point where the TMA stage of the training starts (the starting epoch should be in your config file).
I am facing a similar issue. I am trying to reproduce the results of the paper by training on LJSpeech with a single GPU. Within 1-2 epochs of the TMA stage starting, the Gen and Dis losses start blowing up and eventually become NaN. I am using a batch size of 16 and a learning rate of 1e-4. This is in the first stage of training.
Can you let me know how to stabilize this part of the training?
Perhaps issue https://github.com/yl4579/StyleTTS2/issues/254 and its linked PR https://github.com/yl4579/StyleTTS2/pull/253 could solve this. They did fix the NaN errors for me, although that was in 2nd stage training on a single GPU.
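Separately from that PR, it can help to find out which loss term goes non-finite first. Below is a minimal sketch of such a check, not code from the repo: the helper `first_non_finite`, the loss names, and the sample values are all made up for illustration, and in a real run the values would come from something like `loss.item()` at each step.

```python
import math


def first_non_finite(losses):
    """Return the name of the first loss that is NaN or inf, or None if all are finite."""
    for name, value in losses.items():
        if not math.isfinite(float(value)):
            return name
    return None


# Hypothetical per-step loss values (illustrative only, not real training logs).
history = [
    {"mel": 0.52, "gen": 1.9, "dis": 3.1},
    {"mel": 0.51, "gen": 48.0, "dis": float("inf")},
    {"mel": float("nan"), "gen": float("nan"), "dis": float("nan")},
]

for step, losses in enumerate(history):
    bad = first_non_finite(losses)
    if bad is not None:
        # In a training loop this is where you would skip the optimizer step
        # and dump the offending batch for inspection.
        print(f"step {step}: '{bad}' is the first non-finite loss")
        break
```

Knowing whether the discriminator, generator, or mel term fails first should narrow down whether the problem starts at the TMA stage or earlier.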
I tried to train the first stage on the LJSpeech dataset provided by the developer, with the config file left at its defaults. However, the mel loss decreases to 0.5 and then becomes NaN after 25 epochs. Why does this happen?