Open ClarenceTeee opened 4 years ago
@ClarenceTee93 I have the same question; hope @eriklindernoren could shed some light on it. I've long been following Hands-On Machine Learning with ... by Aurélien Géron, and the equation used for Batch Gradient Descent in that book is:
(2 / training_size) * X_b.T.dot( X_b.dot(theta) - y ); this can be rewritten as (2/m) * X_b.T.dot( y_pred - y )
Even if we assume that the X used in @eriklindernoren's equation already has a bias term included for each sample, and that switching from (y_pred - y) to -(y - y_pred) makes sense, the multiplicative factor must still be included in the equation. To the best of my knowledge the math checks out if we carefully differentiate (y_pred - y)^2 w.r.t. each parameter, as sketched below.
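For reference, here is a short worked derivation of where the 2/m factor comes from (just a sketch, writing the predictions as y_hat = X_b.dot(theta) to match the book's notation):

```latex
% MSE over m training samples, with \hat{y} = X_b \theta
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \bigl( \hat{y}^{(i)} - y^{(i)} \bigr)^2

% Differentiating w.r.t. a single parameter \theta_j:
\frac{\partial J}{\partial \theta_j}
  = \frac{2}{m} \sum_{i=1}^{m} \bigl( \hat{y}^{(i)} - y^{(i)} \bigr)\, x_j^{(i)}

% Collecting all partials into a vector gives the book's formula:
\nabla_\theta J(\theta)
  = \frac{2}{m}\, X_b^{\top} \bigl( \hat{y} - y \bigr)
  = -\frac{2}{m}\, X_b^{\top} \bigl( y - \hat{y} \bigr)
```

So the sign flip to -(y - y_pred) is fine; it is only the 2/m scaling that is dropped.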
Yes, it should be
grad_w = -(y - y_pred).dot(X) + self.regularization.grad(self.w)
in regression.py.
Should it be grad_w = -(y - y_pred).dot(X) * (1/training_size) + self.regularization.grad(self.w) instead?
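For what it's worth, here is a minimal standalone check (plain NumPy; the variable names m, X, y, w below are made-up example names, not the repo's actual code) comparing the analytic gradient with and without the scaling factor against a finite-difference gradient of the MSE, regularization omitted to isolate the scaling question:

```python
import numpy as np

np.random.seed(0)
m, n_features = 50, 3                          # m training samples, 3 features
X = np.random.randn(m, n_features)
true_w = np.array([1.5, -2.0, 0.5])
y = X.dot(true_w) + 0.1 * np.random.randn(m)
w = np.zeros(n_features)                       # point at which we evaluate the gradient

def mse(w):
    # Mean squared error, the loss the gradient should correspond to
    return np.mean((y - X.dot(w)) ** 2)

y_pred = X.dot(w)
grad_unscaled = -(y - y_pred).dot(X)             # as in the current line
grad_scaled = -(2.0 / m) * (y - y_pred).dot(X)   # with the 2/m factor from the MSE

# Central-difference gradient of the MSE for comparison
eps = 1e-6
grad_numeric = np.array([
    (mse(w + eps * np.eye(n_features)[j]) - mse(w - eps * np.eye(n_features)[j])) / (2 * eps)
    for j in range(n_features)
])

print(np.allclose(grad_scaled, grad_numeric, atol=1e-4))    # True
print(np.allclose(grad_unscaled, grad_numeric, atol=1e-4))  # False (off by a factor of m/2)
```

The unscaled gradient still points in the same direction, so gradient descent can still converge with a small enough learning rate, but without the 1/training_size factor the step size effectively depends on the number of samples.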