juntang-zhuang / Adabelief-Optimizer

Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"

Question about SGD optimizer in LSTM experiments #54

Closed yunfei-teng closed 3 years ago

yunfei-teng commented 3 years ago

Hi Juntang,

Nice work indeed! The code is quite well written. May I ask two questions regarding the SGD optimizer in the LSTM experiments?

(1) In the experiments, is there a specific reason for switching from the SGD optimizer to the ASGD optimizer? I did not find any related information about this in the paper.

(2) Should the validation set be used instead of the test set when deciding whether to switch to ASGD?

Thanks for your time.

Best,

juntang-zhuang commented 3 years ago

Hi, thanks for the question. Actually, I have no idea what ASGD is; the code for the LSTM experiments is just forked from https://github.com/salesforce/awd-lstm-lm/blob/master/main.py. Perhaps you can ask the question there.
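
For readers landing here: ASGD in that script refers to PyTorch's `torch.optim.ASGD` (averaged SGD). The forked awd-lstm-lm training loop uses the non-monotonic trigger (NT-ASGD) from the AWD-LSTM paper, switching from SGD to ASGD once the validation loss has stopped improving for a number of epochs. Below is a minimal sketch of that kind of switch, not the repository's exact code; the names `model`, `val_losses`, `lr`, and `nonmono` are hypothetical placeholders.

```python
import torch

def maybe_switch_to_asgd(optimizer, model, val_losses, lr=30.0, nonmono=5):
    """Sketch of an NT-ASGD-style trigger (assumed names, not the repo's exact code).

    `val_losses` is a per-epoch list of held-out validation losses;
    `nonmono` is how many recent epochs must fail to beat the earlier best.
    """
    already_switched = isinstance(optimizer, torch.optim.ASGD)
    stalled = (
        len(val_losses) > nonmono
        and val_losses[-1] > min(val_losses[:-nonmono])
    )
    if not already_switched and stalled:
        # t0=0 starts iterate averaging immediately; lambd=0 keeps eta constant.
        optimizer = torch.optim.ASGD(model.parameters(), lr=lr, t0=0, lambd=0.0)
    return optimizer
```

Note that the trigger in this sketch reads a held-out validation loss, which is what question (2) is getting at; whether a particular fork drives the switch from validation or test loss would need to be checked in its own `main.py`.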