houzhijian closed this issue 4 years ago
I think it's mainly about the hyper-parameters. I spent more time searching for the best ones (e.g. the learning rate). Different machines and libraries can also affect the numbers a little.
Got it. Thank you for your response.
Hi, steve: