I really appreciate the book, which is well focused on the underpinnings of deep learning and gets the reader up to speed very efficiently. A small detail: in the following code, the variable weights_error seems to be redundant; its result is never used:
def backward(self, output_error):
    input_error = np.dot(output_error, self.weights.T)
    weights_error = np.dot(self.input.T, output_error)
    # accumulate the error over the minibatch
    self.delta_w += np.dot(self.input.T, output_error)
    self.delta_b += output_error
    self.passes += 1
    return input_error
Instead, the same calculation is repeated a couple of lines further down for the accumulation of self.delta_w. The computational inefficiency can easily be overlooked in an educational setting; in my own case, however, it made linking the code to the specific mathematical expressions in the book less apparent.
One might rewrite the accumulation as self.delta_w += weights_error to make the link (e.g. to eq. 10.13) more explicit. I imagine deviating from the book is undesirable at this point, so perhaps a comment could be added to the code on GitHub to make the code flow more explicit?
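For concreteness, a minimal sketch of the suggested change (assuming the same class context as in the snippet above, i.e. self.input, self.weights, self.delta_w, self.delta_b and self.passes are set up elsewhere in the layer):

import numpy as np  # the snippet above already relies on numpy as np

def backward(self, output_error):
    # error to propagate back to the previous layer
    input_error = np.dot(output_error, self.weights.T)
    # gradient of the loss w.r.t. the weights (cf. eq. 10.13)
    weights_error = np.dot(self.input.T, output_error)
    # accumulate over the minibatch, reusing the value computed above
    self.delta_w += weights_error
    self.delta_b += output_error
    self.passes += 1
    return input_error

The behaviour is unchanged; the only difference is that the accumulation line now names the quantity it accumulates.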