Open lyz04551 opened 5 years ago
I have several questions about synthesizing long sentences with T2+griffin_lim:
1. Is max_iters only used in the synthesis step? If so, what determines the length (mel or linear) of the decoder output during synthesis? Is it the stop_token? My current problem is that for long input sentences, with max_iters set to 4000, the synthesized audio is followed by an echo.
2. When the input text is of normal length, the linear length of the output is normal, but when the input text is long, the linear length of the output equals max_iters. What is the actual cause? Did you manage to train a good stop_token?
Hi @lyz04551, have you solved this issue? I am facing the same thing. I have tried many approaches but still have not solved it. What was your solution?
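For anyone reading along, here is a minimal sketch of how max_iters and the stop token typically interact in a Tacotron-2-style decoder loop. It is not the repository's code: `decode_length`, `stop_prob_at`, and `threshold` are hypothetical names used only for illustration, and the stop probabilities are made up. It only shows why the output length could end up equal to max_iters when the stop token never fires.

```python
def decode_length(stop_prob_at, max_iters, threshold=0.5):
    """Return how many decoder steps run before decoding ends.

    stop_prob_at: hypothetical callable giving the predicted
                  stop-token probability at a given step (1-based).
    max_iters:    hard cap on the number of decoder steps.
    threshold:    assumed cutoff above which decoding stops.
    """
    for step in range(1, max_iters + 1):
        if stop_prob_at(step) > threshold:
            return step  # stop token fired: output ends here
    return max_iters     # stop token never fired: the cap decides the length


# Illustrative case 1: the stop token fires at step 300.
fires_at_300 = lambda step: 0.9 if step >= 300 else 0.01
print(decode_length(fires_at_300, max_iters=4000))

# Illustrative case 2: the stop token never fires (assumed to be what
# happens on inputs longer than those seen in training).
never_fires = lambda step: 0.01
print(decode_length(never_fires, max_iters=4000))
```

If the second case matches what you see, the frames generated after the real end of the sentence would be a plausible source of the trailing echo, though that is an assumption and not something confirmed in this thread.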