A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
[QUESTION] Chapter 4 Exercise Question 12 - cost function with l2 regularization seems incorrect #118
When attempting the question, there is a bonus part that adds l2 regularization to the softmax regression code (`In [75]`):

According to the book, in the section about Ridge Regression, we are supposed to add $\dfrac{\alpha}{m} \sum_{i=1}^{n} \theta_i^2$ (the sum of the squared weights, scaled by $\alpha/m$) to the original cost function. However, in line 2 of the picture above, `l2_loss` is calculated with a factor of 1/2 at the front. Shouldn't it be 1/m instead?
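For concreteness, here is a minimal sketch of the two variants of the penalty term, assuming `Theta` holds the parameters with the bias terms in row 0 and `m` is the training-set size. The variable names mirror the notebook, but the values are placeholders, not the actual cell:

```python
import numpy as np

# Hypothetical stand-ins for the notebook's variables
m = 100                            # number of training instances
alpha = 0.1                        # regularization hyperparameter
Theta = np.random.randn(5, 3)      # parameters; row 0 holds the bias terms
xentropy_loss = 0.9                # placeholder cross-entropy value

# Penalty with the constant 1/2 factor, as described above:
l2_loss_half = 1/2 * np.sum(np.square(Theta[1:]))

# Penalty matching the book's Ridge formula, (alpha/m) * sum(theta_i**2);
# alpha is applied when the two losses are combined:
l2_loss_over_m = np.sum(np.square(Theta[1:])) / m
loss = xentropy_loss + alpha * l2_loss_over_m
```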
According to the same section of the book, we should add $2\alpha w / m$ to the MSE gradient vector. So in line 3 of the picture above, shouldn't the regularization term be `2 * alpha * Theta[1:] / m` instead?
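A corresponding sketch of the gradient step under that reading, again with hypothetical shapes (`error` stands in for `Y_proba - Y_train_one_hot`, and the bias row is left unregularized):

```python
import numpy as np

m, n_inputs, n_outputs = 100, 5, 3
alpha = 0.1
Theta = np.random.randn(n_inputs, n_outputs)   # row 0 holds the bias terms
X_train = np.random.randn(m, n_inputs)
error = np.random.randn(m, n_outputs)          # placeholder for Y_proba - Y_train_one_hot

# d/dtheta [(alpha/m) * theta**2] = 2 * alpha * theta / m, so the proposed fix
# adds that term to the cross-entropy gradient (zeros for the bias row):
gradients = (1/m * X_train.T.dot(error)
             + np.r_[np.zeros([1, n_outputs]), 2 * alpha * Theta[1:] / m])
```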
Maybe this is why the validation loss suddenly increases so much when regularization is applied.
If this is indeed a typo, the bottom sections involving the hyperparameter `C` also have to be changed.
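The notebook's use of `C` isn't reproduced here, but if it follows the common convention of being the inverse of the regularization strength (as in Scikit-Learn's `LogisticRegression`), then any change in how $\alpha$ enters the loss shifts which value of `C` gives equivalent behavior. A minimal sketch, purely as an assumption about that convention:

```python
from sklearn.linear_model import LogisticRegression

alpha = 0.1
# In Scikit-Learn, C is the inverse of the regularization strength,
# so a given alpha corresponds roughly to C = 1/alpha.
clf = LogisticRegression(C=1/alpha)
```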