In this slide, you can think of the input image as a color image with RGB channels and a size of 32 (height) x 32 (width).
3 : the number of channels in the input image. Most images we see in this class are in RGB format, so there are 3 channels: red, green, and blue.
32 : height of the input image
32 : width of the input image
10 : the number of neurons in the hidden layer
In a "Fully Connected" layer, as the name suggests, each input node is connected to "every node" in the next layer. We are dealing with a 2D image, but we need to feed it into the fully connected layer as a vector, so we have to "flatten" the image into 1D. This is why we multiply the number of channels by the height and width of the image (3 x 32 x 32).
The dimension of weight matrix W is (3 x 32 x 32) x 10. Here, (3 x 32 x 32) input features are connected to each of the 10 neurons in the hidden layer.
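Here is a minimal NumPy sketch of the flatten-then-multiply step described above. The array names (`x`, `W`, `h`) are just illustrative, and the random values stand in for real pixel data and learned weights; only the shapes matter here.

```python
import numpy as np

# Hypothetical RGB input: 3 channels, 32 x 32 pixels
x = np.random.rand(3, 32, 32)

# Flatten the multi-channel 2D image into a 1D vector of 3 * 32 * 32 = 3072 features
x_flat = x.reshape(-1)
print(x_flat.shape)  # (3072,)

# Weight matrix W: (3 x 32 x 32) input features x 10 hidden neurons
W = np.random.rand(3072, 10)

# Fully connected layer: every input feature contributes to every hidden neuron
h = x_flat @ W
print(h.shape)  # (10,)
```

Each of the 10 columns of `W` holds the 3072 weights for one hidden neuron, which is why the total weight count is about 30K.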
I think this picture may help your understanding. Just think of it as a single-channel, 3 x 3 image size and 4 hidden neurons.
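The small case in the picture can be sketched the same way. This is an assumed toy setup (fixed pixel values, all-ones weights) chosen so the arithmetic is easy to check by hand:

```python
import numpy as np

# Single-channel 3 x 3 image with pixel values 0..8, flattened to 9 features
img = np.arange(9).reshape(3, 3)
x = img.reshape(-1)           # shape (9,)

# 4 hidden neurons -> weight matrix of shape (9, 4)
W = np.ones((9, 4))

h = x @ W                     # shape (4,)
print(h)  # with all-ones weights, each neuron sums the 9 pixels: [36. 36. 36. 36.]
```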
If there's any part you don't understand, feel free to ask:)
Thank you so much for your response! I understood the concept well because you even attached an image that I can understand easily!!
First, I checked issue #39. I'm curious about the dimensions of W. In (3x32x32)x(#hidden=10)=30K, what is the 3, what is the first 32, and what is the second 32? I've never studied neural networks, so please understand that I lack the relevant knowledge.