Hi Juntang,
Nice work indeed! The code is very well written. May I ask two questions about the SGD optimizer in the LSTM experiments?
(1) In the experiments, is there a specific reason for switching from the SGD optimizer to the ASGD optimizer? I could not find anything about this in your paper.
(2) When deciding whether to switch to ASGD, should the validation set be used instead of the test set?
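To make question (2) concrete, here is a minimal sketch of the kind of check I have in mind. It assumes the switch is driven by a non-monotonic trigger on a list of per-epoch losses; the function name `should_switch_to_asgd` and the `nonmono` parameter are hypothetical and not taken from your code. The point is only that the losses fed into such a check would come from the validation set, so the test set stays untouched until the final evaluation.

```python
def should_switch_to_asgd(val_losses, nonmono=5):
    """Hypothetical non-monotonic trigger (illustrative, not the repo's code).

    val_losses: validation losses, one per epoch, most recent last.
    Returns True when the latest loss is worse than the best loss recorded
    before the last `nonmono` earlier epochs, i.e. validation has stopped
    improving for a while.
    """
    if len(val_losses) < 2:
        return False
    history, current = val_losses[:-1], val_losses[-1]
    # Need more than `nonmono` earlier epochs before the check is meaningful.
    if len(history) <= nonmono:
        return False
    return current > min(history[:-nonmono])


# Validation loss plateaus, then gets worse: the trigger fires.
plateau = [5.0, 4.8, 4.6, 4.5, 4.5, 4.5, 4.5, 4.5, 4.7]
print(should_switch_to_asgd(plateau, nonmono=5))
```

If the check returned True, the training loop would then replace the SGD optimizer with an averaged one (for example `torch.optim.ASGD` in PyTorch) for the remaining epochs.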
Thanks for your precious time.
Best,