Open nickvazz opened 2 years ago
If you have issues, you can comment here, and it's pretty easy to make nice-looking text formatting with markdown: https://guides.github.com/features/mastering-markdown/
@nickvazz Hi Nick, I have some questions:
Could you check why I can't put images from the training set into a subplot like in the test set? Link
I understand the code where they prepare the data, scale the images, flatten images into 1D arrays, and convert class vectors to binary class matrices (like for the number 5 we have [0,0,0,0,0,1,0,0,0,0]). I learned a few things about the layers in the Sequential model, but not everything: the first layer needs to have an input shape; conv2d and maxpooling2d layers downsample the image feature maps; the dropout layer helps avoid overfitting the model; and changing the order of layers affects the end result. But I guess I need to read more about the Sequential model, like the one-hot vector and how it relates to the loss function categorical_crossentropy (if it does), how to choose loss functions, the meaning of each number in a layer's output shape, terms like batch size, epochs, activation, optimizer, and metrics, and how to create a validation set and then compare training vs. validation performance.
Can we put in our own image of a handwritten digit and let the model predict what digit we have?
Hey @truc-h-nguyen, no problem!
1) Looks like the test-set code has a typo that happens to make it work, and the train-set code doesn't have that typo but has a small, easily fixed error.

Basically `plt.show()` makes the plot. In the `test` case you never actually _called_ `plt.show`, since it requires the `()` to get called. In the `train` case you call it correctly, but inside the most nested loop, so a figure shows after each plot. Moving `plt.show()` outside the loops fixes both:
```python
index = 0
for nrows in range(3):
    for ncols in range(3):
        plt.subplot(3, 3, index + 1)
        plt.imshow(x_test[index], cmap="Pastel1")
        index += 1
plt.show()
```
and
```python
n_rows = 3
n_cols = 3
i = 0
for row in range(n_rows):
    for col in range(n_cols):
        plt.subplot(n_rows, n_cols, i + 1)
        plt.imshow(x_train[i], cmap='Pastel1')
        i += 1
plt.show()
```
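As a side note, a tidier way to build the same grid is `plt.subplots`, which hands you all the axes up front so there's no index bookkeeping and only one `show()` at the end. A sketch, using random data in place of the real `x_train`:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

x_train = np.random.rand(9, 28, 28)  # stand-in for the real MNIST images

# one figure with a 3x3 grid of axes; axes.flat walks them row by row
fig, axes = plt.subplots(3, 3)
for i, ax in enumerate(axes.flat):
    ax.imshow(x_train[i], cmap="Pastel1")
plt.show()
```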
2) Yep, all those things are true.

`conv2d` doesn't always return a smaller shape, though. It can return the exact same shape if you change the `padding` argument when you create a `conv2d` layer. From the Keras docs:

> padding: one of "valid" or "same" (case-insensitive). "valid" means no padding. "same" results in padding with zeros evenly to the left/right or up/down of the input such that output has the same height/width dimension as the input.
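To make the shape difference concrete, here's a small sketch of the output-size arithmetic those two padding modes use. This is plain Python using the standard convolution formulas, not Keras itself:

```python
import math

def conv2d_output_size(n, kernel, stride=1, padding="valid"):
    """Spatial output size along one dimension of a conv layer."""
    if padding == "valid":
        # no padding: the kernel must fit entirely inside the input
        return (n - kernel) // stride + 1
    # "same": zero-pad so the output size depends only on the stride
    return math.ceil(n / stride)

# a 28-pixel-wide MNIST image through a 3x3 convolution
print(conv2d_output_size(28, 3, padding="valid"))  # 26
print(conv2d_output_size(28, 3, padding="same"))   # 28
```

With `padding="same"` and `stride=1` the image stays 28 wide, which is why stacking several such layers doesn't shrink the feature map.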
`Dropout` helps with overfitting during training by randomly removing some node connections so the model doesn't become too dependent on any specific connection, but you have to remember that it's only active during training, not when you actually use the model for predictions.

One-hot encoding turns each digit label (`1,2,3,4,5,6,7,8,9,0`) into a binary vector because it wants to make it clear that `7` is not close to `8` or `6`. It's also commonly used to turn words into numbers for Natural Language Processing models.

As for `categorical cross-entropy`, it comes from Information Theory, and this page gives a pretty good quick understanding of what it's good for. You can also use `sklearn` to do lots of common preprocessing.
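A minimal NumPy sketch of both ideas (these are my own illustrative helpers, not the Keras or sklearn functions):

```python
import numpy as np

def one_hot(labels, num_classes=10):
    # turn integer digit labels into rows of a binary matrix
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    # -sum(true * log(pred)) per sample; eps avoids log(0)
    return -np.sum(y_true * np.log(y_pred + eps), axis=1)

y = one_hot([5])
print(y[0])  # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]

# a prediction that puts most probability on the right class gets a lower loss
good = np.array([[0.01] * 5 + [0.91] + [0.01] * 4])
uniform = np.array([[0.1] * 10])
print(categorical_crossentropy(y, good) < categorical_crossentropy(y, uniform))  # [ True]
```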
3) We totally can!
```python
from skimage.transform import resize
import matplotlib.image as mpimg

# this is a link to the image above
# you can copy and paste images into comments (while you're making them) and get a free link
img_path = "https://user-images.githubusercontent.com/8573886/138998248-b7753983-a85b-472b-b62d-892fe4be2a8c.png"
img = mpimg.imread(img_path)
img_resized = resize(img, (28, 28))
print(img_resized.shape, img.shape, x_test[0].shape)
```
will print

```
(28, 28, 4) (334, 496, 4) (28, 28, 1)
```
`img_resized[:,:,0]` grabs just the first color channel, as the first two `:` are for the rows and columns of the image. The model expects input shaped `(batch, rows, columns, colors)`, while `x_test[0].shape` is only `(28, 28, 1)`. The `-1` used in the front just tells the `np.array` to do whatever it needs to keep the other `28,28,1` shape correct. If we were to have some `np.array` that was 3920 items long (28x28x5), doing `.reshape(-1,28,28,1)` would result in a shape of `(5, 28, 28, 1)`. So:

```python
model.predict(img_resized[:,:,0].reshape(-1,28,28,1))
```
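The `-1` trick is easy to check directly with NumPy (just a sketch with dummy data):

```python
import numpy as np

flat = np.arange(28 * 28 * 5)        # 3920 items, like 5 flattened grayscale images
batch = flat.reshape(-1, 28, 28, 1)  # -1 lets NumPy infer the leading batch size
print(batch.shape)  # (5, 28, 28, 1)
```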
Hopefully all that helps!
Thank you for your detailed explanation! It really helps me understand the model and related information better. I'm learning more about convolutional layers, and this video, combined with what you said last Thursday, is making the topic a lot clearer for me. I'm still a bit confused about how the convolution kernels are chosen, especially when we have 2 or more convolutional layers, but I guess I'll talk more about my question over the phone.
good example of bbox issues: https://github.com/matterport/Mask_RCNN