Snowdar / asv-subtools

Open Source Tools for Speaker Recognition
Apache License 2.0

[ Frank Discussion 1 ] Training Strategy #3

Open Snowdar opened 4 years ago

Snowdar commented 4 years ago

Feel free to discuss training strategies here.

There are two typical training strategies, "SGD + Reduce Learning Rate on Plateau" and "Adam + Warm Restarts".

SGD + Reduce Learning Rate on Plateau

(1) Training is slow, but the resulting model can generalize well. (2) The parameters of ReduceLROnPlateau, such as patience and the learning rate scale factor, need to be set carefully. ......
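To make the roles of patience and the learning rate scale concrete, here is a minimal pure-Python sketch of a reduce-on-plateau rule. It is only an illustration under simplifying assumptions: the class name PlateauScheduler and its defaults are hypothetical, and it leaves out details that real schedulers such as PyTorch's ReduceLROnPlateau also handle (improvement threshold, cooldown).

```python
class PlateauScheduler:
    """Toy reduce-on-plateau rule (hypothetical helper, not the asv-subtools code)."""

    def __init__(self, lr, factor=0.5, patience=2, min_lr=1e-6):
        self.lr = lr
        self.factor = factor      # learning rate scale applied on a plateau
        self.patience = patience  # epochs without improvement that are tolerated
        self.min_lr = min_lr      # lower bound on the learning rate
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Update the state with one validation loss and return the current LR."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        # Reduce only once the number of bad epochs exceeds the patience.
        if self.bad_epochs > self.patience:
            self.lr = max(self.lr * self.factor, self.min_lr)
            self.bad_epochs = 0
        return self.lr


# Made-up validation losses, for illustration only.
losses = [1.0, 0.8, 0.8, 0.8, 0.8, 0.7]
sched = PlateauScheduler(lr=0.1, factor=0.5, patience=2)
lrs = [sched.step(loss) for loss in losses]
```

With these made-up losses the learning rate is halved only at the fifth epoch, after three consecutive epochs without improvement. A smaller patience or a smaller factor makes the schedule decay faster, which is why both need tuning.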

Adam + Warm Restarts

(1) It is not clear how to set T for Warm Restarts. (2) It is hard to decide how many restarts there should be. ......
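As a rough illustration of how T and the number of restarts are linked, the sketch below computes a cosine warm-restart schedule in pure Python. It assumes the usual cosine annealing form with a first cycle of t_0 epochs and a cycle-length multiplier t_mult; the function names and the example values are hypothetical and are not taken from asv-subtools.

```python
import math


def warm_restart_lr(epoch, base_lr, t_0, t_mult=1, eta_min=0.0):
    """Learning rate at a given epoch under cosine annealing with warm restarts."""
    t_i, t_cur = t_0, epoch
    # Walk through completed cycles to find the position inside the current one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    cosine = 1 + math.cos(math.pi * t_cur / t_i)
    return eta_min + 0.5 * (base_lr - eta_min) * cosine


def restarts_within(total_epochs, t_0, t_mult=1):
    """Number of restarts that happen before total_epochs is reached."""
    count, t_i, boundary = 0, t_0, t_0
    while boundary < total_epochs:
        count += 1
        t_i *= t_mult
        boundary += t_i
    return count


# Example values chosen for illustration only.
n_restarts = restarts_within(total_epochs=70, t_0=10, t_mult=2)
lr_at_restart = warm_restart_lr(epoch=30, base_lr=1e-3, t_0=10, t_mult=2)
```

In this example the cycles last 10, 20 and 40 epochs, so a 70-epoch budget contains two restarts (at epochs 10 and 30) and ends exactly when the third cycle finishes. Choosing the total number of epochs so that training stops at the end of a cycle is one way to tie the two open questions together.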

In fact, I am still not sure how the value of weight decay influences the results when training with these two strategies. Are there any other factors that determine the final performance when comparing the two strategies?

You are welcome to comment and share your experiments.