shahuzi opened this issue 5 years ago
I also have a 5h dataset. I tried training with r=1 and batch_size=48 on 2 GPUs, but haven't got any alignment after 100K steps. With r=2, I got good alignment already after 20K steps.
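For anyone wondering what r actually changes: here is a rough sketch (plain NumPy, not the repo's code) of how the reduction factor groups mel frames so the decoder predicts r frames per step. With r=1 the attention has to cover many more decoder steps for the same utterance, which is one common explanation for the slower/less stable alignment:

```python
import numpy as np

def group_mel_targets(mel, r):
    """Group mel frames so the decoder emits r frames per step.

    mel: [T, n_mels] float array of mel-spectrogram frames.
    Returns an array of shape [ceil(T / r), r * n_mels].
    Smaller r means more decoder steps per utterance.
    """
    T, n_mels = mel.shape
    pad = (-T) % r                                  # pad so T is divisible by r
    mel = np.pad(mel, ((0, pad), (0, 0)), mode="constant")
    return mel.reshape(-1, r * n_mels)

# Example: ~5 s of audio at 80 mel bins and ~12.5 ms hop -> ~400 frames.
mel = np.random.randn(400, 80).astype(np.float32)
print(group_mel_targets(mel, r=1).shape)  # (400, 80)  -> 400 decoder steps
print(group_mel_targets(mel, r=2).shape)  # (200, 160) -> 200 decoder steps
```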
@tugstugi Thanks for your reply, so it seems that batch_size is not the main reason. Do you have any explanation or personal intuition for this phenomenon (r=1 gives bad alignment)?
see also: https://github.com/Rayhane-mamah/Tacotron-2/issues/175
I will resume the training until it reaches 200K and report the result here :)
Thanks!
See https://github.com/Rayhane-mamah/Tacotron-2/issues/342. That might be very useful for judging whether you are getting alignment early in training. The gradient is stochastic, so if you do not see a reasonable alignment plot by 1~2K steps, just try training from scratch a few more times.
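If you want a number instead of eyeballing the plot, here is a rough sketch (a hypothetical helper, not code from the repo) that scores how concentrated each decoder step's attention is:

```python
import numpy as np

def attention_sharpness(alignment):
    """Rough alignment score in [0, 1] for one utterance.

    alignment: [decoder_steps, encoder_steps] attention weights
    (rows sum to 1). Averaging the per-step maximum gives a crude
    'focus' measure: values near 1/encoder_steps mean diffuse
    attention, values approaching 1 mean a sharp, mostly monotonic path.
    """
    return float(alignment.max(axis=1).mean())

# Usage: compute this every few hundred steps; if it stays near the
# diffuse baseline by ~1-2K steps, consider restarting from scratch.
align = np.random.dirichlet(np.ones(120), size=300)  # fake diffuse alignment
print(attention_sharpness(align))  # only a few percent for diffuse attention
```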
@begeekmyfriend Have you got any alignment with r=1?
@tugstugi I just got an OOM exception with this setting.
After 200K steps, still no alignment. Tried again with r=2 and got an alignment after 20K.
@tugstugi Thanks for your reply. After a series of tries such as modifying the dropout_rate and adjusting the input embedding dims, I have decided to give up on r=1 and will try r=2.
@shahuzi I am new to Tacotron2. Please, where can I find r? In hparams.py? In fact, I have the same problem discussed in this issue.
@begeekmyfriend Hi, I noticed you mentioned in #161 that you never got alignment with r=1 and batch_size=24. I think I have the same problem now. I have a dataset of about 5 hours (about 3000 sentences), and in order to update gradients more frequently I set batch_size=8, but it can't align well; here is one alignment during synthesis, and the synthesized wav does not sound intelligible at all. But when I set outputs_per_step=3, the result is relatively more normal. So, is this issue caused by the small batch_size setting?
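In case it helps to be concrete, the fields I am touching in hparams.py look roughly like this (just a sketch of my settings; exact field names depend on the fork, so check your copy):

```python
# hparams.py -- sketch of the fields discussed above; names/values are
# illustrative and may differ between Tacotron-2 forks.
hparams = dict(
    outputs_per_step=3,  # "r": mel frames predicted per decoder step (1 = no reduction)
    batch_size=8,        # lowered to update gradients more often on my ~5h dataset
)
```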