lipiji / TranSummar

Transformer for abstractive summarization on cnn/daily-mail and gigawords
MIT License

optimizer #9

Closed: sIncerass closed 4 years ago

sIncerass commented 5 years ago

Hi @lipiji, thanks for the implementation. One thing I am wondering: did you use Adagrad instead of Adam (with warmup) to obtain the reported scores?

lipiji commented 4 years ago

@sIncerass Yes. Adam also works and may achieve better performance with warmup and a better learning-rate scheduler.
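
For readers who want to try Adam with warmup, here is a minimal sketch of one common warmup schedule: the inverse-square-root schedule from the original Transformer paper. It is not taken from this repository. The function name `noam_lr` and the default values (`d_model=512`, `warmup_steps=4000`, `factor=1.0`) are assumptions chosen for illustration.

```python
def noam_lr(step, d_model=512, warmup_steps=4000, factor=1.0):
    """Learning rate with linear warmup followed by inverse-sqrt decay.

    Hypothetical helper, not part of TranSummar. `step` is 1-based.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    # Linear increase while step < warmup_steps, then decay as step ** -0.5.
    # The two branches are equal exactly at step == warmup_steps (the peak).
    scale = min(step ** -0.5, step * warmup_steps ** -1.5)
    return factor * d_model ** -0.5 * scale


if __name__ == "__main__":
    # Print the learning rate at a few steps to show the rise and fall.
    for s in (1, 1000, 4000, 16000):
        print(s, noam_lr(s))
```

In PyTorch, one way to apply this (again an assumption about setup, not the author's configuration) is to create `torch.optim.Adam` with `lr=1.0` and wrap it in `torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda s: noam_lr(s + 1))`, calling `scheduler.step()` after each optimizer update.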