MichalDanielDobrzanski / DeepLearningPython

neuralnetworksanddeeplearning.com integrated scripts for Python 3.5.2 and Theano with CUDA support
MIT License

np.dot(w, activation) throws an error in backprop(x, y) #29

Open xXCoolinXx opened 4 years ago

xXCoolinXx commented 4 years ago

This code

for b, w in zip(self.biases, self.weights):
    z = np.dot(w, activation) + b
    zs.append(z)
    activation = sigmoid(z)
    activations.append(activation)

throws the following error, and I don't really know why. (I am using Python 3.8.3 and numpy 1.18.4)

ValueError: operands could not be broadcast together with shapes (10,784) (10,30)

xXCoolinXx commented 4 years ago

Note also that VS Code says that the shapes of w and activation were (10, 30) and (30, 784) respectively. This seems to be different from the shapes in the error.
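For reference, here is a minimal sketch of the shapes this loop expects. It assumes the book's convention of one (n, 1) column vector per input and a hypothetical [784, 30, 10] network with random weights. The helper name trace_shapes is made up for illustration and is not part of network.py.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def trace_shapes(weights, biases, x):
    """Run the forward loop and record the shape of every activation."""
    activation = x
    shapes = [activation.shape]
    for b, w in zip(biases, weights):
        z = np.dot(w, activation) + b
        activation = sigmoid(z)
        shapes.append(activation.shape)
    return shapes

rng = np.random.default_rng(0)
sizes = [784, 30, 10]  # assumed layer sizes, not taken from the issue
biases = [rng.standard_normal((n, 1)) for n in sizes[1:]]
weights = [rng.standard_normal((n, m)) for m, n in zip(sizes[:-1], sizes[1:])]

# A (784, 1) column vector keeps a single column through every layer.
column = rng.standard_normal((784, 1))
print(trace_shapes(weights, biases, column))  # [(784, 1), (30, 1), (10, 1)]

# A flat (784,) input does not fail right away: adding the (30, 1) bias
# broadcasts the (30,) result to (30, 30), and the bad shape propagates.
flat = column.ravel()
print(trace_shapes(weights, biases, flat))    # [(784,), (30, 30), (10, 30)]
```

If the shapes seen in the debugger do not end in a single column, the input was probably not in the (n, 1) form, though that is only one possible cause.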

Willt125 commented 4 years ago

I can confirm on Python 3.8.2 with Numpy 1.18.3 that this error occurs (albeit with recorded sizes of (30, 1) and (10, 1)), with both the 2.x and 3.x versions of the code.

The line in question is line 123 of network.py.

doublethink13 commented 4 years ago

me too! any fixes?

miles-1 commented 4 years ago

I'm late to the game, but I'm curious whether other people were using the load_data function instead of the load_data_wrapper function from the mnist_loader file.

I'd be surprised if others made the same mistake that I did, but for a while I thought I had the problem from this ticket, until I realized I was calling load_data. That's what was going wrong for me.

I just think a mismatch big enough that the shapes can't line up for a dot product is a good indicator that it's probably a math or logic problem and not a Python/NumPy version issue.
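To make the difference concrete, here is a rough sketch of the reshaping that load_data_wrapper applies on top of load_data, using small random stand-in data instead of the real MNIST file. The function name wrap is hypothetical. vectorized_result mirrors the helper of the same name in mnist_loader.py, but this version is a reimplementation written from memory.

```python
import numpy as np

def vectorized_result(j):
    """Return a (10, 1) one-hot column for digit j."""
    e = np.zeros((10, 1))
    e[j] = 1.0
    return e

def wrap(images, labels):
    """Turn an (N, 784) image array and (N,) labels into (x, y) pairs."""
    inputs = [np.reshape(x, (784, 1)) for x in images]
    targets = [vectorized_result(y) for y in labels]
    # list(...) matters on Python 3, where zip returns a one-shot iterator.
    return list(zip(inputs, targets))

rng = np.random.default_rng(0)
images = rng.random((5, 784))          # stand-in for load_data()'s image array
labels = rng.integers(0, 10, size=5)   # stand-in for the label array

training_data = wrap(images, labels)
x, y = training_data[0]
print(images.shape, x.shape, y.shape)  # (5, 784) (784, 1) (10, 1)
```

Passing the raw (N, 784) array where the network expects (784, 1) columns will generally fail in np.dot, which matches the kind of error reported here.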

I'm using Python 3.6.9 and Numpy 1.19.1

hamolicious commented 4 years ago

I'm late to the game, but I'm curious whether other people were using the load_data function instead of the load_data_wrapper function from the mnist_loader file.

training_data, validation_data, test_data = load_data() raises ValueError: shapes (16,2296) and (50000,784) not aligned: 2296 (dim 1) != 50000 (dim 0)

training_data, validation_data, test_data = load_data_wrapper() raises ValueError: setting an array element with a sequence.

I also wrote my own function to load and format the data and that throws ValueError: shapes (16,2296) and (784,) not aligned: 2296 (dim 1) != 784 (dim 0)

All three point at line 101 (z = np.dot(w, activation) + b), and I can't find the cause in any of them.
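A (16, 2296) first weight matrix looks like a network built with 2296 input neurons, while each MNIST image has 784 values, so the mismatch may come from the layer sizes rather than the loader. One way to surface that early is a small guard before the forward loop. The sketch below is only an assumption about how such a check could look. check_input is a hypothetical helper, not something in network.py.

```python
import numpy as np

def check_input(weights, x):
    """Raise a readable error if x is not a column that fits the first layer."""
    expected = weights[0].shape[1]  # number of input neurons
    x = np.asarray(x)
    if x.shape != (expected, 1):
        raise ValueError(
            f"expected input of shape ({expected}, 1), got {x.shape}")
    return x

# Hypothetical [784, 16, 10] network; only the weight shapes matter here.
weights = [np.zeros((16, 784)), np.zeros((10, 16))]

check_input(weights, np.zeros((784, 1)))  # passes silently

try:
    check_input(weights, np.zeros((50000, 784)))  # raw load_data()-style array
except ValueError as err:
    print(err)  # expected input of shape (784, 1), got (50000, 784)
```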

yc-cui commented 2 years ago

Fixed. Change np.dot(delta, activations[-2].transpose()) to np.outer(delta, activations[-2].transpose()).

The two arrays have shapes (a,) and (b,), respectively. When both arguments are 1-D, np.dot computes an inner product, so it cannot give us an a×b result (and transpose() does nothing to a 1-D array). Use np.outer instead to get the a×b matrix we want.
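A small illustration of that difference, with made-up values standing in for delta and activations[-2]. It also shows that np.outer on 1-D arrays gives the same matrix the original np.dot line produces when both operands are (n, 1) columns, which is the layout the book's code assumes.

```python
import numpy as np

delta = np.array([1.0, 2.0, 3.0])   # shape (3,), stand-in for delta
prev = np.array([10.0, 20.0])       # shape (2,), stand-in for activations[-2]

# With two 1-D arrays of different lengths, np.dot raises ValueError.
try:
    np.dot(delta, prev.transpose())
except ValueError as err:
    print("np.dot failed:", err)

# np.outer builds the (3, 2) matrix of all pairwise products.
outer = np.outer(delta, prev)
print(outer.shape)  # (3, 2)

# Same result via np.dot once both operands are column vectors.
delta_col = delta.reshape(-1, 1)    # shape (3, 1)
prev_col = prev.reshape(-1, 1)      # shape (2, 1)
via_dot = np.dot(delta_col, prev_col.transpose())
print(np.array_equal(outer, via_dot))  # True
```

So np.outer works around 1-D inputs, while keeping every activation as a column vector makes the original np.dot line work unchanged.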