backstopmedia / tensorflowbook


The Linear Regression Example is Incorrect. #21

Closed zwu-net closed 7 years ago

zwu-net commented 7 years ago

The code at https://github.com/backstopmedia/tensorflowbook/blob/master/chapters/04_machine_learning_basics/linear_regression.py is not consistent with http://tempforum.neas-seminars.com/Attachment4373.aspx. Please also refer to http://stackoverflow.com/questions/41884411/puzzled-by-linear-regression-results-from-spark-ml-and-tensorflow/41899240#41899240.

Please fix the code, thanks.

arielscarpinelli commented 7 years ago

Thank you for the comment!

Indeed, there was a bug: the inference result was not transposed before subtracting the expected value to calculate the loss. As a result, the subtraction was broadcast across mismatched shapes and the calculated loss was wrong.
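To illustrate the kind of shape mismatch described above, here is a minimal sketch in NumPy rather than the book's TensorFlow code. The data, the names `X`, `W`, `Y`, and the shapes (predictions of shape `(n, 1)`, targets of shape `(n,)`) are assumptions made for the example, not taken from the repository.

```python
import numpy as np

# Hypothetical toy data: 3 examples, 2 features. The weights are chosen
# so that the predictions match the targets exactly.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
W = np.array([[0.5], [0.25]])        # weights, shape (2, 1)
Y = np.array([1.0, 2.5, 4.0])        # targets, shape (3,)

Y_predicted = X @ W                  # shape (3, 1)

# Buggy: (3,) minus (3, 1) broadcasts to (3, 3), so every target is
# compared against every prediction.
buggy_diff = Y - Y_predicted
buggy_loss = np.sum(buggy_diff ** 2)

# Fixed: transpose the predictions to (1, 3) so the subtraction is
# element-wise, one prediction per target.
fixed_diff = Y - Y_predicted.T
fixed_loss = np.sum(fixed_diff ** 2)

print(buggy_diff.shape, buggy_loss)
print(fixed_diff.shape, fixed_loss)
```

Even with a perfect fit, the broadcast version reports a non-zero loss, because it also sums the squared differences between unrelated pairs of targets and predictions.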

Also, to make it converge to the analytical least-squares result, more training epochs were needed.

This is because the stop condition in the training code was just a fixed number of iterations, to keep the model code simple. A common way to overcome this is to make the stop condition check the percentage change in the loss and stop once it stays below a certain threshold, say 0.1%, for a few iterations.
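A stop condition of that kind could look roughly like the sketch below. The function name `should_stop` and the parameters `threshold` and `patience` are hypothetical, chosen for illustration; they are not part of the book's code.

```python
def should_stop(loss_history, threshold=0.001, patience=3):
    """Return True when the relative change in the loss stayed below
    `threshold` (0.001 means 0.1%) for the last `patience` iterations."""
    if len(loss_history) < patience + 1:
        return False  # not enough history to judge yet
    recent = loss_history[-(patience + 1):]
    for previous, current in zip(recent, recent[1:]):
        if previous == 0:
            # Avoid dividing by zero: only "no change" counts as converged.
            change = 0.0 if current == 0 else float("inf")
        else:
            change = abs(previous - current) / abs(previous)
        if change >= threshold:
            return False
    return True


# Usage: call it once per iteration with the losses recorded so far.
print(should_stop([10.0, 5.0, 4.999, 4.998, 4.997]))
print(should_stop([10.0, 5.0, 2.5, 1.25]))
```

In a training loop, the loss would be appended to `loss_history` after each step, and the loop would break as soon as `should_stop` returns `True`, ideally with a maximum iteration count kept as a safety net.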

Since adding too many epochs would make the training slow, I also opted to increase the learning rate. But that made the loss computation overflow. So I also opted to stick with the original formula for the loss and use the mean squared error instead of just the sum of the squared errors. From the optimization point of view the two should be equivalent, but because dividing by the number of examples makes the resulting loss value much lower, it allows a much bigger learning rate and achieves convergence in fewer training epochs.
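The effect of dividing by the number of examples can be seen in a small pure-Python sketch. The function `fit`, the toy data, and the learning rate of 0.03 are all assumptions for illustration; the book's example uses TensorFlow and different data.

```python
def fit(xs, ys, learning_rate, epochs, use_mean):
    """Plain gradient descent on y = w * x + b with a squared-error loss.

    use_mean=True  -> mean squared error (gradient divided by n)
    use_mean=False -> sum of squared errors
    """
    w, b = 0.0, 0.0
    scale = 1.0 / len(xs) if use_mean else 1.0
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * scale * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 * scale * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]

# Same learning rate and epoch count, different loss scaling.
w_mse, b_mse = fit(xs, ys, learning_rate=0.03, epochs=2000, use_mean=True)
w_sse, b_sse = fit(xs, ys, learning_rate=0.03, epochs=2000, use_mean=False)

print("mean squared error:", w_mse, b_mse)
print("sum of squared errors:", w_sse, b_sse)
```

With the mean, the run settles near the true slope and intercept. With the sum, the gradient is n times larger, so the same learning rate overshoots on every step and the parameters blow up to non-finite values. Dividing the learning rate by n would give the sum-based loss the same trajectory, which is why the two are equivalent from the optimization point of view.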

Again, thank you.