codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation
Apache License 2.0

Question about the implementation of learning_rate #17

Closed: wenhaozheng-nju closed this issue 5 years ago

wenhaozheng-nju commented 5 years ago

Nice implementation! However, I have a question about the learning rate. The learning-rate schedule in the original Transformer uses warm-up, but your implementation just uses simple decay. Could you implement it in your BERT code?

codertimo commented 5 years ago

@wenhaozheng-nju Great point, I'll implement it ASAP. Thank you 👍 Please let me know if you have any more requests!

codertimo commented 5 years ago

@wenhaozheng-nju If you can implement this functionality, could you write the code and open a pull request? I have no experience with this kind of thing :) Please let me know your thoughts.

wenhaozheng-nju commented 5 years ago

You could refer to this implementation:

https://github.com/jadore801120/attention-is-all-you-need-pytorch/blob/20f355eb655bad40195ae302b9d8036716be9a23/transformer/Optim.py#L4
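For readers following the thread, here is a minimal sketch of the warm-up schedule described in the original Transformer paper: the rate rises linearly for `warmup_steps` steps, then decays with the inverse square root of the step number. The function name `transformer_lr` is hypothetical, and the defaults (`d_model=512`, `warmup_steps=4000`) are the paper's base-model values, not settings taken from this repository or from the linked file.

```python
def transformer_lr(step: int, d_model: int = 512, warmup_steps: int = 4000) -> float:
    """Learning rate at a given optimizer step (1-indexed).

    Formula from "Attention Is All You Need":
        lr = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)

    The defaults are assumptions for illustration, not BERT-pytorch settings.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    # Linear warm-up term: grows proportionally to the step number.
    warmup_term = step * warmup_steps ** -1.5
    # Decay term: shrinks with the inverse square root of the step number.
    decay_term = step ** -0.5
    # The smaller term wins, so the rate peaks at step == warmup_steps.
    return d_model ** -0.5 * min(warmup_term, decay_term)


if __name__ == "__main__":
    # Print the rate at a few steps to show the rise and the decay.
    for s in (1, 1000, 4000, 16000):
        print(s, transformer_lr(s))
```

One possible way to use it is to call `transformer_lr(step)` before each optimizer step and write the result into every parameter group's `lr`. Alternatively, with a base learning rate of 1.0, it could be passed as the `lr_lambda` of `torch.optim.lr_scheduler.LambdaLR`, shifting the step by one because that scheduler counts from 0.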