Sharing hyper-parameters (learning rate settings)

tensorflow / benchmarks

A benchmark framework for Tensorflow

Sharing hyper-parameters (learning rate settings) #254

Open ShawnDing1994 opened 5 years ago

ShawnDing1994 commented 5 years ago

I would really appreciate it if the hyper-parameters known to converge were provided, i.e., how the learning rate should be initialized and decayed. I re-implemented ResNet-50 following the example, but I am now trying MobileNet and have no idea how to adjust the learning rate. I don't know whether the learning rate setting from slim.model (0.045, decayed by 0.98 every 2.5 epochs) would work here, given the differences between the implementations.
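
For reference, the schedule quoted above (initial rate 0.045, multiplied by 0.98 every 2.5 epochs) can be sketched in plain Python as below. This is only an illustration of the arithmetic, not code from tf_cnn_benchmarks or slim. The function name, the `steps_per_epoch` argument, and the assumption that the decay is applied in discrete "staircase" steps rather than continuously are all hypothetical.

```python
def decayed_learning_rate(global_step, steps_per_epoch,
                          initial_lr=0.045, decay_rate=0.98,
                          epochs_per_decay=2.5):
    """Staircase exponential decay (illustrative helper, not a library API).

    The rate is multiplied by `decay_rate` once every `epochs_per_decay`
    epochs; between those boundaries it stays constant (an assumption).
    """
    # Number of training steps between two consecutive decays.
    decay_steps = int(steps_per_epoch * epochs_per_decay)
    # How many decay boundaries have been crossed so far.
    num_decays = global_step // decay_steps
    return initial_lr * decay_rate ** num_decays


if __name__ == "__main__":
    # Hypothetical run with 1000 steps per epoch, so one decay every 2500 steps.
    for step in (0, 2499, 2500, 5000):
        print(step, decayed_learning_rate(step, steps_per_epoch=1000))
```

Here `steps_per_epoch` would be the training set size divided by the global batch size, so the same epoch-based setting maps to different step counts depending on the batch size in use.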

reedwm commented 5 years ago

Currently, in tf_cnn_benchmarks, only resnet50 is regularly tested to converge to the accuracy reported in its paper. Other models are typically untested and so are likely to have convergence issues.

/CC @mingxingtan, has mobilenet been trained to convergence in tf_cnn_benchmarks and if so, what were the hyperparameters?