Tacotron2 Pre-training have difficulties

TensorSpeech / TensorFlowTTS

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

https://tensorspeech.github.io/TensorFlowTTS/

Apache License 2.0

Tacotron2 Pre-training have difficulties #782

Closed: gyu-bbang closed this issue 1 year ago

gyu-bbang commented 1 year ago

Hello, I am a student learning to train Tacotron2 on the KSS dataset.

After running Tacotron2 KSS pre-training for 120k steps and checking the results in TensorBoard, I get the values shown in the screenshot below.

The loss in the "val" (validation) section keeps increasing as training continues.

When I export the synthesized output as a wav file, the speech is unintelligible.

I'd like to ask for your advice on this matter.

[Image: Screenshot from 2022-12-20 16-09-30]