wenhaozheng-nju closed 5 years ago
@wenhaozheng-nju great point, I'll implement it ASAP. Thank you 👍 Please let me know if you have any more requests!
@wenhaozheng-nju if you can implement this functionality, could you write the code and open a pull request? I have no experience with this kind of thing :) Please let me know your thoughts.
You could refer to this implementation:
Nice implementation! However, I have a question about the learning rate. The learning rate schedule from the original Transformer is warm-up restart, but your implementation uses only a simple decay. Could you implement it in your BERT code?
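For reference, here is a minimal sketch of the schedule I mean, assuming it is the one from the original Transformer paper ("Attention Is All You Need"): the learning rate grows linearly for `warmup_steps` steps and then decays with the inverse square root of the step number. The function name `transformer_lr` and the defaults `d_model=512` and `warmup_steps=4000` are my own choices for illustration and are not taken from this repository.

```python
def transformer_lr(step, d_model=512, warmup_steps=4000):
    """Learning rate at a given training step.

    Sketch of the warm-up schedule from the original Transformer paper:
        lr = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)
    The name and default values are illustrative assumptions.
    """
    step = max(step, 1)  # guard against step 0, where step ** -0.5 is undefined
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


if __name__ == "__main__":
    # The rate rises until warmup_steps, then falls off slowly.
    for s in (1, 1000, 4000, 8000, 100000):
        print(s, transformer_lr(s))
```

In a PyTorch training loop, something like this could probably be wrapped in a small helper that sets the `lr` of each optimizer parameter group before every step, but how it would fit into the existing trainer is up to you.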