Question about WaveGlow

I have trained WaveRNN and got good results. Since WaveGlow's inference on GPU is faster than WaveRNN's, I want to train a WaveGlow vocoder.

For your WaveGlow repository, I made the following modifications to make training possible:

Moved the quant and mel data (the data used to train WaveRNN) into it.
Changed mel_pad_val in config.json to -5.0 (I set voc_pad_val = -5.0 when training WaveRNN).
Added z = z.type(torch.cuda.HalfTensor) before glow.py-L116 to fix the training error "Input type (torch.cuda.ShortTensor) and weight type (torch.cuda.HalfTensor) should be the same".
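For reference, the cast in the third modification can be sketched as below. This is a minimal illustration on CPU, not the code from glow.py: the helper name int16_to_half is hypothetical, and the division by 32768 is an assumption that the quant data holds raw 16-bit PCM samples, which has not been confirmed here. A plain z.type(torch.cuda.HalfTensor) changes only the dtype and leaves the integer sample values unscaled.

```python
import torch


def int16_to_half(samples: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: convert int16 audio samples to float16.

    Assumption: `samples` holds raw 16-bit PCM values in [-32768, 32767].
    The values are rescaled to roughly [-1, 1] before the half-precision cast.
    """
    if samples.dtype != torch.int16:
        raise TypeError("expected an int16 (ShortTensor) input")
    # Do the arithmetic in float32, then cast down to float16.
    scaled = samples.to(torch.float32) / 32768.0
    return scaled.to(torch.float16)


# Toy input standing in for a slice of the quant data.
quant = torch.tensor([-32768, 0, 16384, 32767], dtype=torch.int16)
audio = int16_to_half(quant)
print(audio.dtype, audio)
```

On a GPU the same helper would be applied to a CUDA tensor; whether rescaling is actually needed depends on how the quant files were written during WaveRNN preprocessing.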

After 13 hours of training (21 epochs), the log is: [Screenshot 2021-01-13 11:04:40]

When I run inference with --is_fp16, the output wav is completely silent. Without --is_fp16, the output wav is pure noise.

Did I do something wrong? Could you give me some suggestions?

Maybe the training is not enough yet. It has now run for 24 hours (24K steps, 40 epochs). I still sometimes get the warning "Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 1024.0".
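The "Gradient overflow" warning is the kind of message a dynamic loss scaler prints in mixed-precision training. The idea can be sketched in plain Python as follows. This is a toy model, not the library's implementation: the class name ToyLossScaler and the factor and window defaults are assumptions chosen for illustration.

```python
class ToyLossScaler:
    """Hypothetical, simplified dynamic loss scaler (illustration only)."""

    def __init__(self, init_scale=2.0 ** 16, factor=2.0, window=2000):
        self.scale = init_scale   # current multiplier applied to the loss
        self.factor = factor      # assumed shrink/grow factor
        self.window = window      # assumed clean steps needed before growing
        self.good_steps = 0

    def update(self, overflow: bool) -> bool:
        """Record one step; return True if the optimizer step should run."""
        if overflow:
            # inf/NaN gradients: shrink the scale and skip this step.
            self.scale /= self.factor
            self.good_steps = 0
            return False
        self.good_steps += 1
        if self.good_steps >= self.window:
            # A long run of clean steps: try a larger scale again.
            self.scale *= self.factor
            self.good_steps = 0
        return True


scaler = ToyLossScaler(init_scale=2048.0)
applied = scaler.update(overflow=True)
print(applied, scaler.scale)  # False 1024.0
```

In this model an occasional skipped step is expected behaviour, because the scaler periodically probes a larger scale. A scale that keeps falling without recovering would instead suggest values that are too large for fp16 somewhere in the network.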
