Open AIROBOTAI opened 6 years ago
Thanks!
That's a good question, and honestly I'm not sure I completely know why! The original MobileNets paper seems to use hyperparameters from the Inception papers, whereas I used the hyperparameters from the more recent NASNet paper (a much larger learning rate of ~0.2), which seem to give much better accuracy. I got similarly good results with the MobileNet v2 models (and there, the numbers reported in the paper are pretty close as well).
Wow, that's a big discovery! This is strong evidence of how important hyperparameters are in DL :-D Thanks for your explanation!
Hi @balancap, I'd like to run your source code for training MobileNet-v1/v2. I guess the training command should be python tf_cnn_benchmarks.py followed by the hyperparameter settings. Could you please share the list of hyperparameters you used? Or did you just follow the NASNet paper? Thanks!
Hi @balancap, could you please share more details about the hyperparameters? Thanks a lot!
@balancap Do you mean that you trained MobileNet v1 with a learning rate of ~0.2 and achieved better accuracy than the original paper?
My learning rate was set to ~0.05. I tried 6000 to 10000 steps and only got 68% top-1 accuracy. Is the small learning rate the problem?
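For context on the step counts above, here is a rough sketch that converts training steps into passes over the dataset. It assumes the ImageNet-1k training set (1,281,167 images), which the thread does not state explicitly. The batch sizes are hypothetical, since none is given here. It does not answer the learning-rate question by itself; it only helps put 6000 to 10000 steps in perspective.

```python
# Assumption: ImageNet-1k (ILSVRC-2012) training set size.
IMAGENET_TRAIN_IMAGES = 1_281_167


def steps_to_epochs(steps, batch_size, dataset_size=IMAGENET_TRAIN_IMAGES):
    """Return how many passes over the dataset a number of steps corresponds to."""
    return steps * batch_size / dataset_size


if __name__ == "__main__":
    # Hypothetical batch sizes; the thread does not say which one was used.
    for batch_size in (64, 256):
        for steps in (6000, 10000):
            epochs = steps_to_epochs(steps, batch_size)
            print(f"batch_size={batch_size:4d} steps={steps:6d} -> {epochs:.2f} epochs")
```

If the resulting epoch count is small, the length of the training run may matter as much as the learning rate, so the two are worth checking together.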
Thanks for sharing your great work!
The MobileNet-v1 you trained achieves 72.9% top-1 accuracy, which surpasses the number reported in the original paper (70.6%) by a large margin. Could you please explain why? Thanks!