hilmandayo opened this issue 7 years ago
I am referring to your exercise 4 code right now to complete mine. I think you've made a mistake in regularizing the NN backpropagation gradient (if I am wrong, pardon me). This is the equation:
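For reference, this should be the regularized gradient from ex4, which leaves the bias column ($j = 0$) unregularized:

$$
\frac{\partial}{\partial \Theta^{(l)}_{ij}} J(\Theta) = \frac{1}{m}\Delta^{(l)}_{ij} \quad \text{for } j = 0
$$

$$
\frac{\partial}{\partial \Theta^{(l)}_{ij}} J(\Theta) = \frac{1}{m}\Delta^{(l)}_{ij} + \frac{\lambda}{m}\Theta^{(l)}_{ij} \quad \text{for } j \ge 1
$$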
And this is your code:
```python
delta1 = d2.dot(a1)    # 25x5000 * 5000x401 = 25x401
delta2 = d3.T.dot(a2)  # 10x5000 * 5000x26 = 10x26
theta1_ = np.c_[np.ones((theta1.shape[0],1)),theta1[:,1:]]
theta2_ = np.c_[np.ones((theta2.shape[0],1)),theta2[:,1:]]
theta1_grad = delta1/m + (theta1_*reg)/m
theta2_grad = delta2/m + (theta2_*reg)/m
```
Shouldn't it be:
```python
theta1_ = np.c_[np.zeros((theta1.shape[0],1)),theta1[:,1:]]
theta2_ = np.c_[np.zeros((theta2.shape[0],1)),theta2[:,1:]]
```
since we do not want to add anything to `theta1_grad`'s and `theta2_grad`'s first column (the bias)?
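For what it's worth, here is a minimal sketch (with made-up shapes and random data, not the actual ex4 variables) of how zeroing the first column keeps the bias term out of the regularization:

```python
import numpy as np

np.random.seed(0)
m = 5                             # number of training examples (assumed)
reg = 1.0                         # regularization strength lambda (assumed)
theta1 = np.random.randn(4, 3)    # 4 hidden units, 2 inputs + bias (assumed)
delta1 = np.random.randn(4, 3)    # accumulated gradient Delta for layer 1 (assumed)

# Zero out the bias column so it receives no regularization
theta1_ = np.c_[np.zeros((theta1.shape[0], 1)), theta1[:, 1:]]
theta1_grad = delta1 / m + (theta1_ * reg) / m

# First column is the plain delta1/m (no regularization on the bias),
# remaining columns include the (reg/m) * theta term
assert np.allclose(theta1_grad[:, 0], delta1[:, 0] / m)
assert np.allclose(theta1_grad[:, 1:], delta1[:, 1:] / m + (reg / m) * theta1[:, 1:])
print(theta1_grad)
```

With `np.ones` instead of `np.zeros`, the first column would get an extra `reg/m` added to every bias gradient, which is what the question is pointing out.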
I also have this question.