Open hunterlew opened 7 years ago
Just jumping into the comments section. It seems the learning rates were decided after empirical tests. My own finding is that the smaller the dataset, the higher the learning rate needed for the error to converge. @zhanghang1989 would be able to comment better on this.
I've run the CIFAR-10 experiments several times and found that the learning rate and the decay step have a large influence on the final result, which fluctuates by about 0.1~0.2%. How can I address this? Any suggestions would be appreciated!
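For anyone following along, here is a minimal sketch of how one might separate the effect of the schedule from ordinary run-to-run noise. It assumes "learning rate and step" refers to a plain step-decay schedule, which the thread does not confirm. The names `step_lr`, `summarize`, `base_lr`, `step` and `gamma` are hypothetical and are not taken from this repo.

```python
import statistics


def step_lr(base_lr, epoch, step, gamma=0.1):
    """Learning rate at `epoch` under step decay.

    Assumption: the rate is multiplied by `gamma` once every `step` epochs.
    """
    return base_lr * gamma ** (epoch // step)


def summarize(accuracies):
    """Mean and sample standard deviation of final accuracies.

    `accuracies` holds one final test accuracy per run, where the runs
    share the same settings and differ only in the random seed.
    """
    return statistics.mean(accuracies), statistics.stdev(accuracies)


if __name__ == "__main__":
    # Print an illustrative schedule; these values are placeholders,
    # not the settings used in this repo.
    for epoch in (0, 29, 30, 60):
        print(epoch, step_lr(base_lr=0.1, epoch=epoch, step=30))
```

If the standard deviation across seeds for one fixed setting is already around 0.1~0.2%, then differences of that size between two learning-rate settings are probably within noise, and comparing means over several seeds would be more informative than comparing single runs.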