nfmcclure / tensorflow_cookbook

Code for Tensorflow Machine Learning Cookbook
https://www.packtpub.com/big-data-and-business-intelligence/tensorflow-machine-learning-cookbook-second-edition
MIT License

GradientDescentOptimizer example is very sensitive to initial seed for A and learning rate #76

Open TheQuant opened 7 years ago

TheQuant commented 7 years ago

I've been getting inconsistent results with the Deming regression example given in your text. I'm running Tensorflow locally on an iMac under macOS Sierra 10.12.4. I explored different initial values for A (rather than the random normal initialization in the text) combined with different learning rates. Negative starting values for A often led to poor fits: lines with negative slopes and large intercepts, suggesting gradients moving in the wrong direction and clearly diverging from what the raw data suggests on inspection, even though the loss function (the Deming distance) kept improving throughout the optimization given sufficient iterations.
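The sensitivity described above can be reproduced with a minimal pure-Python re-implementation of the Deming (perpendicular-distance) loss and plain sub-gradient descent — this is an illustrative sketch, not the book's TensorFlow code, and the data set and starting values are made up:

```python
import math

# Illustrative synthetic data on the exact line y = 2x + 1
# (not the data set used in the book).
xs = [i * 0.25 for i in range(13)]          # 0.0 .. 3.0
ys = [2.0 * x + 1.0 for x in xs]

def deming_loss(A, b):
    """Mean perpendicular (Deming) distance from the points to y = A*x + b."""
    s = math.sqrt(A * A + 1.0)
    return sum(abs(A * x + b - y) for x, y in zip(xs, ys)) / (len(xs) * s)

def fit(A, b, lr=0.01, steps=5000):
    """Plain sub-gradient descent on the Deming loss."""
    n = len(xs)
    for _ in range(steps):
        s = math.sqrt(A * A + 1.0)
        dA = db = 0.0
        for x, y in zip(xs, ys):
            r = A * x + b - y
            sg = 1.0 if r > 0 else (-1.0 if r < 0 else 0.0)
            # d/dA of |r|/s = sign(r)*x/s - |r|*A/s^3, d/db of |r|/s = sign(r)/s
            dA += sg * x / s - abs(r) * A / s ** 3
            db += sg / s
        A -= lr * dA / n
        b -= lr * db / n
    return A, b

A_good, b_good = fit(1.0, 0.0)    # positive initial slope: recovers the line
A_bad, b_bad = fit(-2.0, 0.0)     # negative initial slope: drifts the wrong way
```

With a positive initial slope the fit lands near the true line (A ≈ 2, b ≈ 1); with a negative one the slope stays negative and the final loss is far worse, even though each run decreases its own loss monotonically for most steps.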

I understand the impact that the learning rate has on convergence as well, but wondered whether you could suggest a good way to choose the initial values for the variables and the learning rate. Also, what is the best way to determine whether the optimization is achieving reasonable results? I've looked at examining the actual gradient calculations during the iterations, but thought there must be a better way...
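One common heuristic for the initialization question (not something the book itself suggests, so treat it as an assumption) is to seed the optimizer with the closed-form ordinary least-squares solution, which is cheap to compute and already has the right sign for the slope. A sketch with an illustrative helper:

```python
# Hypothetical helper (not from the book): pick starting values for A and b
# from the ordinary least-squares closed form, so gradient descent on the
# Deming loss begins near a line that already roughly fits the data.
def ols_init(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)            # Sum of (x - mean)^2
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    A0 = sxy / sxx                                   # OLS slope
    b0 = my - A0 * mx                                # OLS intercept
    return A0, b0

# Illustrative data on y = 2x + 1
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [2.0 * x + 1.0 for x in xs]
A0, b0 = ols_init(xs, ys)
```

Starting the Deming optimization from (A0, b0) instead of a random normal draw avoids the negative-slope basin entirely for data like this, since the OLS slope always matches the sign of the sample covariance.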

nfmcclure commented 7 years ago

I'll take a look into this over the weekend. I'm sure there's a better way to estimate some starting points based on the data.