ageron / handson-ml

⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.
Apache License 2.0

Ch 4: batch gradient descent demonstration with various learning rates #446

Open genemishchenko opened 5 years ago

genemishchenko commented 5 years ago

Hi Aurelien.

May I modestly suggest the attached implementation of the plot_gradient_descent() function? I think it drives home your point:

A simple solution is to set a very large number of iterations but to interrupt the algorithm when the gradient vector becomes tiny...

It also demonstrates how to actually calculate the magnitude (Euclidean norm) of a vector.

Lastly, I also took the liberty of plotting the final predictions. That may be good for illustrative purposes, since the fitted line won't show up at all on the third plot, the one whose learning rate is too high.

Cheers! Gene.

plot_gd.txt
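In short, the idea is roughly this (a simplified sketch, not the full contents of the attached plot_gd.txt; details like the `tolerance` value and the divergence guard are just illustrative choices):

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)
m = 100
X = 2 * np.random.rand(m, 1)
y = 4 + 3 * X + np.random.randn(m, 1)
X_b = np.c_[np.ones((m, 1)), X]              # prepend x0 = 1 to every instance

X_new = np.array([[0.0], [2.0]])
X_new_b = np.c_[np.ones((2, 1)), X_new]

def plot_gradient_descent(theta, eta, n_iterations=100_000, tolerance=1e-7):
    plt.plot(X, y, "b.")
    for iteration in range(n_iterations):
        gradients = 2 / m * X_b.T.dot(X_b.dot(theta) - y)
        grad_magnitude = np.linalg.norm(gradients)  # magnitude of the gradient vector
        if grad_magnitude < tolerance:              # converged: gradient is tiny
            break
        if not np.isfinite(grad_magnitude):         # diverged: learning rate too high
            break
        theta = theta - eta * gradients
    plt.plot(X_new, X_new_b.dot(theta), "r-")       # plot the final predictions
    plt.axis([0, 2, 0, 15])
    plt.title(r"$\eta = {}$, {} iterations".format(eta, iteration + 1))

plt.figure(figsize=(12, 4))
for i, eta in enumerate((0.02, 0.1, 0.5)):
    plt.subplot(1, 3, i + 1)
    plot_gradient_descent(np.random.randn(2, 1), eta)
plt.show()
```

With eta = 0.5 the gradient grows instead of shrinking, so the final prediction line lands far outside the fixed axes, which is exactly why it is invisible on the third plot.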

ageron commented 5 years ago

Wow thanks Gene, that's very kind of you! I'll check it out asap. 👍

yashGuleria commented 5 years ago

Hi, since this issue is for Chapter 4, I am writing my query here: when using the Normal Equation, why do we need to convert X to X_b using np.c_? What is the logic behind it?

genemishchenko commented 5 years ago

@yashGuleria, in X_b a 1 is added at the beginning of each instance's feature vector to get the bias/intercept term in the result: the constant feature x0 = 1 is what the intercept theta_0 gets multiplied by, so without it the intercept will not appear in the result of the closed-form solution (i.e. the Normal Equation). Thanks. Gene.
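To make that concrete, here is a minimal sketch in the spirit of the book's Chapter 4 code (the data-generating line is just an example):

```python
import numpy as np

m = 100
X = 2 * np.random.rand(m, 1)            # one input feature
y = 4 + 3 * X + np.random.randn(m, 1)   # true intercept 4, slope 3, plus noise

X_b = np.c_[np.ones((m, 1)), X]         # prepend the constant feature x0 = 1

# Normal Equation: theta_best = (X_b^T X_b)^(-1) X_b^T y
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
print(theta_best)                       # roughly [[4.], [3.]]: intercept first

# Without the column of ones there is no column for the intercept to
# attach to, so the fitted line would be forced through the origin.
```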

yashGuleria commented 5 years ago

@genemishchenko, thank you for helping out. Regards, Yash.