Hi, I have a question about the learning rate in the example "word_language_model".
The initial lr is 20, which seems very large. Can you tell me why lr is set to 20?
Thanks a lot!
If you have any advice on improving the performance, please let me know. Thanks!