Page 38 says we update the weights and bias by adding the negative gradient times the learning rate to each parameter. However, in the Adaline code sections on pages 40 and 49, no negative gradient appears. For example, page 49 has `self.w_ += self.eta * 2.0 * xi * (error)`.

Thanks for the note, and that's a good point. At first glance it may look wrong, but we have `errors = (y - output)`, and that's already the negative gradient. That's because `-(y - output)` is the gradient, as shown at the bottom of pg. 38, and `-1 * -(y - output)` simplifies to `(y - output)`.
The obvious was staring me in the face, and I still managed to overlook it. Thanks for the clarification!