MichalDanielDobrzanski / DeepLearningPython

neuralnetworksanddeeplearning.com integrated scripts for Python 3.5.2 and Theano with CUDA support
MIT License

Missing sigmoid_prime() #17

Closed. DOUS3L closed this issue 3 years ago.

DOUS3L commented 5 years ago

https://github.com/MichalDanielDobrzanski/DeepLearningPython35/blob/ea229ac6234b7f3373f351f0b18616ca47edb8a1/network2.py#L253

Here it should be `delta = self.cost_derivative(activations[-1], y) * sigmoid_prime(zs[-1])`; the code from the book does that (screenshot attached).
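
For reference, the chapter 2 backward pass applies the sigmoid_prime factor explicitly at the output layer. A minimal, self-contained sketch of that step, with toy stand-ins for the `activations`/`zs` lists that backprop builds (names follow network.py's conventions):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_prime(z):
        # derivative of the sigmoid: sigma'(z) = sigma(z) * (1 - sigma(z))
        return sigmoid(z) * (1.0 - sigmoid(z))

    # toy forward-pass results for the output layer
    z_last = np.array([[0.2], [-1.3]])   # weighted input z^L
    a_last = sigmoid(z_last)             # activation a^L
    y = np.array([[1.0], [0.0]])         # target vector

    # quadratic-cost output error: delta^L = (a^L - y) * sigma'(z^L)
    delta = (a_last - y) * sigmoid_prime(z_last)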

cgarbin commented 5 years ago

@DOUS3L I'm using this code for a class exercise and came across your comment. Would you mind clarifying where you see the code in the book multiplying by `sigmoid_prime(zs[-1])`?

The code in chapter 2 does that. However, network2.py comes from this section in chapter 3.

The original Python 2 code from the author is also available here.

The code looks like this:

    # backward pass
    delta = (self.cost).delta(zs[-1], activations[-1], y)
    nabla_b[-1] = delta
    nabla_w[-1] = np.dot(delta, activations[-2].transpose())

This was done because in network.py the loss function is the quadratic cost, while in network2.py it is the cross-entropy (log) cost. The chapter 3 discussion of choosing the learning rate points this out:

As we saw earlier, the gradient terms for the quadratic cost have an extra σ′=σ(1−σ) term in them.
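
In other words, the sigmoid_prime factor is not missing from network2.py; it is folded into the cost class's delta() method, and for the cross-entropy cost the σ′ factor cancels algebraically, leaving delta = a − y. A minimal sketch of the two cost classes, following the structure of the book's network2.py (details may differ slightly from the repo):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_prime(z):
        return sigmoid(z) * (1.0 - sigmoid(z))

    class QuadraticCost(object):
        @staticmethod
        def delta(z, a, y):
            # the quadratic cost keeps the sigma'(z) factor
            return (a - y) * sigmoid_prime(z)

    class CrossEntropyCost(object):
        @staticmethod
        def delta(z, a, y):
            # the sigma'(z) factor cancels for the cross-entropy cost
            return (a - y)

So `(self.cost).delta(zs[-1], activations[-1], y)` applies the right factor for whichever cost the network was constructed with.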

MichalDanielDobrzanski commented 3 years ago

The cost can be defined in the main test.py script that runs the network learning algorithm. Hence, this is a generic implementation that should not assume a specific loss function.
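
For example, a test.py run might pass the cost class to the constructor; a sketch assuming the network2.py API (the SGD arguments shown are illustrative):

    import mnist_loader
    import network2

    training_data, validation_data, test_data = \
        mnist_loader.load_data_wrapper()

    # the chosen cost's delta() is what backprop uses for the output error
    net = network2.Network([784, 30, 10], cost=network2.CrossEntropyCost)
    net.SGD(training_data, 30, 10, 0.5,
            evaluation_data=validation_data,
            monitor_evaluation_accuracy=True)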