Closed chowdhuryj-github closed 2 weeks ago
Here is the information gathered from this video: https://www.youtube.com/watch?v=YRhxdVk_sIs
CNNs process images by detecting patterns layer by layer. They start with small filters that slide over the image to find simple patterns such as edges. Deeper layers combine these into shapes, and then into complex features like objects.
Pooling layers reduce the amount of data while keeping the essential information, which helps the network focus. Fully connected layers then interpret the detected patterns to make predictions, allowing CNNs to classify images accurately.
Input: Imagine you have a small, grayscale (black-and-white) image of a handwritten "3". Each pixel has a value representing brightness (0 for black, 255 for white).
Convolution Layer: The CNN starts by applying small filters, like a 3x3 grid, that slide over the image. One filter might detect vertical lines, another horizontal lines. When a filter finds something that looks like a line, it highlights that area in a "feature map."
Pooling Layer: The CNN simplifies each feature map by keeping only the most important value in each small region (in max pooling, for example, the largest value in each 2x2 block), making the data smaller but still highlighting key features like curves or corners.
Fully Connected Layer: After several layers of finding and pooling features, the CNN has a "picture" of important elements in the number. It combines these to decide what the number is most likely to be.
Output: Based on the features detected, the CNN might output a high probability for "3," meaning it’s confident the image is the number 3.
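The steps above can be sketched in plain NumPy to make the data flow concrete. This is only an illustration, not the method from the video. Several things here are assumptions: the toy 8x8 image (a single vertical stroke rather than a real "3"), the hand-picked edge filters (a real CNN learns its filters during training), the ReLU step, and the random, untrained fully connected weights. Because of those weights, the final probabilities are meaningless. Only the shapes and the flow of data are the point.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide a kernel over the image (stride 1, no padding) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel with the patch under it, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    trimmed = fmap[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Input: toy 8x8 grayscale image, black background with one white vertical stroke.
image = np.zeros((8, 8))
image[:, 4] = 255.0
image = image / 255.0  # scale brightness from 0..255 to 0..1

# Convolution layer: two hand-picked 3x3 filters (assumed, not learned).
vertical_edges = np.array([[-1, 0, 1],
                           [-1, 0, 1],
                           [-1, 0, 1]], dtype=float)
horizontal_edges = vertical_edges.T

# ReLU (an assumption here) zeroes out negative responses.
feature_maps = [np.maximum(conv2d(image, k), 0.0)
                for k in (vertical_edges, horizontal_edges)]

# Pooling layer: each 6x6 feature map shrinks to 3x3.
pooled = [max_pool(f) for f in feature_maps]

# Fully connected layer: flatten everything and score 10 digit classes.
features = np.concatenate([p.ravel() for p in pooled])
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.1, size=(10, features.size))  # untrained, random
bias = np.zeros(10)

# Output: one probability per digit 0-9.
probs = softmax(weights @ features + bias)
print("feature map shape:", feature_maps[0].shape)
print("pooled shape:", pooled[0].shape)
print("predicted digit (untrained, so arbitrary):", int(np.argmax(probs)))
```

The vertical-edge filter responds along the stroke while the horizontal-edge filter stays at zero everywhere, which mirrors the "one filter detects vertical lines, another horizontal lines" description. In a trained network, the weights in the last layer would be learned so that the highest probability lands on the correct digit.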
Here are the deliverables: