Open tomsonsgs opened 7 years ago
Yes, I also observe this problem. Do you have any thoughts on how to solve it?
I switched the initialization method from "He et al." to Xavier, and the accuracy immediately increased a lot.
I also found that the configs differ from the original paper in some places, such as the learning rate.
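For anyone wanting to see what the initialization switch amounts to, here is a minimal framework-free sketch of the two schemes. It is not the repository's code: the function names (`xavier_uniform`, `he_normal`, `variance`), the 256x256 layer shape, and the choice of the uniform variant of Xavier and the normal variant of He are all assumptions made for illustration.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform: U(-a, a) with a = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def he_normal(fan_in, fan_out, rng):
    """He normal: N(0, std^2) with std = sqrt(2 / fan_in)."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)]
            for _ in range(fan_in)]


def variance(matrix):
    """Empirical variance over all entries of a nested-list matrix."""
    values = [v for row in matrix for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical layer shape, chosen only for the comparison.
rng = random.Random(0)
w_xavier = xavier_uniform(256, 256, rng)
w_he = he_normal(256, 256, rng)

print("Xavier variance:", variance(w_xavier))  # theory: 2 / (fan_in + fan_out)
print("He variance:    ", variance(w_he))      # theory: 2 / fan_in
```

With `fan_in == fan_out`, He initialization gives weights with twice the variance of Xavier, so the two schemes start training from noticeably different weight scales. Whether that difference explains the accuracy change reported here is not something this sketch can show.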
@michael-wzhu I have the same problem. Could you please specify how you modified the code, and what accuracy you obtained on task1 after the modification?