NVIDIA / waveglow

A Flow-based Generative Network for Speech Synthesis
BSD 3-Clause "New" or "Revised" License
2.29k stars 530 forks

Issue training for male dataset #245

Open zshakeri opened 3 years ago

zshakeri commented 3 years ago

I have been trying to train the model on a male-speaker dataset. I've tried both training from scratch and fine-tuning the provided checkpoint. I tried the default parameters (batch size 3, 8 GPUs), increasing the batch size to 32 on 8 GPUs, and varying the learning rate. In all cases, the training loss saturates at about -5 after roughly 5k-20k steps and then either increases or blows up. Do you have any suggestions for what to do in this case? Have you trained the model on any dataset other than LJ? Examples of training loss curves:

[image: Screen Shot 2020-12-16 at 10 38 57 AM] https://user-images.githubusercontent.com/58200907/103936684-95bab300-50dc-11eb-98ce-8eece0745a58.png

[image: Screen Shot 2021-01-07 at 11 47 22 AM] https://user-images.githubusercontent.com/58200907/103937794-204fe200-50de-11eb-81fe-5dba19ed0972.png

rafaelvalle commented 3 years ago

Try weight decay and clipping the norm of the gradients.
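For readers unsure what these two suggestions do, here is a minimal pure-Python sketch of one update step with L2 weight decay and global gradient-norm clipping. It is an illustration only, not the repository's training code: the helper names (`clip_grad_norm`, `sgd_step`) and all numeric values (`lr`, `weight_decay`, `max_norm`) are assumptions chosen for the example. In a PyTorch training loop the usual equivalents would be the `weight_decay` argument of the optimizer and a call to `torch.nn.utils.clip_grad_norm_` between `backward()` and `optimizer.step()`.

```python
import math


def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Return gradients rescaled so their global L2 norm is at most max_norm.

    Also returns the norm measured before clipping, which is useful to log
    when diagnosing a loss that blows up.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Only ever shrink the gradients; leave them untouched if already small.
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


def sgd_step(params, grads, lr, weight_decay=0.0, max_norm=None):
    """One plain SGD update with optional clipping and L2 weight decay."""
    if max_norm is not None:
        grads, _ = clip_grad_norm(grads, max_norm)
    # L2 weight decay adds weight_decay * p to each gradient, pulling
    # the parameters toward zero on every step.
    return [p - lr * (g + weight_decay * p) for p, g in zip(params, grads)]


# Hypothetical values: a gradient with norm 50 is clipped down to norm 5.
params = [1.0, -2.0]
grads = [30.0, 40.0]
clipped, norm_before = clip_grad_norm(grads, max_norm=5.0)
new_params = sgd_step(params, grads, lr=0.1, weight_decay=0.01, max_norm=5.0)
```

Logging `norm_before` at each step can show whether the gradient norm spikes just before the loss diverges, which helps in picking a sensible `max_norm`.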