seungwonpark / melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)
http://swpark.me/melgan/
BSD 3-Clause "New" or "Revised" License

Use dynamic learning rate decay for convergence #39

Open begeekmyfriend opened 4 years ago

begeekmyfriend commented 4 years ago

The evaluation samples sound better than those produced with a fixed learning rate.

Signed-off-by: begeekmyfriend begeekmyfriend@gmail.com

seungwonpark commented 4 years ago

Hi, your code looks great, and thanks for kindly sending the PR! Can you please share the audio samples you got (with the number of epochs) for comparison?

begeekmyfriend commented 4 years ago

melgan_eval_mandarin.zip I have synthesized voices from 4 anchors (1 male and 3 females). The checkpoint is only at epoch 375 and still under training. I think the dynamic decay helps convergence.

bob80333 commented 4 years ago

Is this different from pytorch's built-in CosineAnnealingLR?
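For reference, attaching PyTorch's built-in `CosineAnnealingLR` looks roughly like the sketch below. This is an illustration, not the PR's code; the model, optimizer, and hyperparameter values here are hypothetical placeholders.

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

# Hypothetical generator stand-in and optimizer (not from the PR).
model = torch.nn.Linear(80, 80)
opt_g = torch.optim.Adam(model.parameters(), lr=1e-4)

# T_max: number of scheduler steps over which the LR anneals from the
# initial value down to eta_min along a half-cosine curve.
sched_g = CosineAnnealingLR(opt_g, T_max=1000, eta_min=1e-6)

for step in range(5):
    # ... forward / backward / opt_g.step() would go here ...
    sched_g.step()  # decay the learning rate once per step

print(opt_g.param_groups[0]["lr"])  # slightly below the initial 1e-4
```

Whether the PR's schedule matches this exactly depends on how often it steps (per epoch vs. per iteration) and whether it uses warm restarts.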

Liujingxiu23 commented 3 years ago

@begeekmyfriend
You tried different learning rate schedules and found that cosine LR is the best? Why is it suitable for MelGAN? I am confused: when should we use a constant LR, when a decaying LR (for example, exponential decay in Tacotron), and when cosine LR?

begeekmyfriend commented 3 years ago

It is just a preference. Pick it or another schedule as you like.
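The schedule shapes discussed above (exponential decay vs. cosine annealing) can be compared with a short sketch. This is illustrative only and uses a hypothetical helper; it is not code from the PR.

```python
import torch
from torch.optim.lr_scheduler import ExponentialLR, CosineAnnealingLR

def lr_curve(scheduler_cls, steps=100, **kwargs):
    """Record the learning rate at each step for a given scheduler
    (hypothetical helper for illustration)."""
    opt = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=1.0)
    sched = scheduler_cls(opt, **kwargs)
    lrs = []
    for _ in range(steps):
        lrs.append(opt.param_groups[0]["lr"])
        sched.step()
    return lrs

exp_lrs = lr_curve(ExponentialLR, gamma=0.95)     # geometric decay each step
cos_lrs = lr_curve(CosineAnnealingLR, T_max=100)  # half-cosine over 100 steps

# Exponential decay drops fastest early on; cosine annealing stays near
# the initial LR at first, falls steeply mid-schedule, and flattens near
# zero at the end, which some find gentler for GAN training.
```

A constant LR is simply the `lrs` list without any scheduler steps; which shape works best is usually an empirical choice, as the reply above suggests.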

Liujingxiu23 commented 3 years ago

@begeekmyfriend Thank you for your quick reply. I used your branch of Tacotron and found it is one of the best among many code branches. I will try cosine LR as well as apex in tfgan.