mrdbourke / pytorch-deep-learning

Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.
https://learnpytorch.io
MIT License

Different inputs on Conv2d #763

Open kumarsiddappa-git opened 9 months ago

kumarsiddappa-git commented 9 months ago

What input do we need to provide to the Conv2d model (TinyVGG)? When I pass an image of shape [1, 28, 28] it fails, unlike when I call unsqueeze(dim=0), which gives the shape [1, 1, 28, 28].

Do we need to pass [batch size, color channels, image height, image width]?

My model is:

class FashionMNISTModelV2(nn.Module):
  """
  Model architecture that replicates the TinyVGG model from CNN Explainer https://poloclub.github.io/cnn-explainer/
  """

  def __init__(self,input_shape:int , hidden_units:int, output_shape: int):
    super().__init__()
    self.conv_block_1 = nn.Sequential(
        nn.Conv2d(in_channels=input_shape,
                  out_channels=hidden_units,
                  kernel_size=3,
                  stride=1,
                  padding=1),
        nn.ReLU(),
        nn.Conv2d(in_channels=hidden_units,
                  out_channels=hidden_units,
                  kernel_size=3,
                  padding=1,
                  stride=1),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2)
    )
    self.conv_block_2 = nn.Sequential(
        nn.Conv2d(in_channels=hidden_units,
                  out_channels=hidden_units,
                  kernel_size=3,
                  stride=1,
                  padding=1),
        nn.ReLU(),
        nn.Conv2d(in_channels=hidden_units,
                  out_channels=hidden_units,
                  kernel_size=3,
                  stride=1,
                  padding=1),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2)

    )
    self.classifier = nn.Sequential(
        nn.Flatten(),
        nn.Linear(in_features=hidden_units,
                  out_features=output_shape
                )
    )

  def forward(self,x):
    x= self.conv_block_1(x)
    print("block 1=",x.shape)
    x = self.conv_block_2(x)
    print("block 2=",x.shape)
    x = self.classifier(x)
    print(x.shape)
    return x

The model instantiation

torch.manual_seed(42)
modelv2=FashionMNISTModelV2(input_shape=1,
                            hidden_units=10,
                            output_shape=len(class_names)).to(device)
modelv2.parameters()
modelv2

Forward pass on a single image

print(image.shape)
y_pred = modelv2(image)

label=torch.argmax(y_pred)

Error thrown is

torch.Size([1, 28, 28])
block 1= torch.Size([10, 14, 14])
block 2= torch.Size([10, 7, 7])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-19-d0bea8693f34> in <cell line: 2>()
      1 print(image.shape)
----> 2 y_pred = modelv2(image)
      3 
      4 label=torch.argmax(y_pred)
      5 print(label)

8 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py in forward(self, input)
    112 
    113     def forward(self, input: Tensor) -> Tensor:
--> 114         return F.linear(input, self.weight, self.bias)
    115 
    116     def extra_repr(self) -> str:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (10x49 and 10x10)

So I understand that the tensor shape is wrong when it reaches the Linear layer of the classifier.

When I change in_features to hidden_units*7*7, it becomes 490, but I still get a matrix shape error:

torch.Size([1, 28, 28])
block 1= torch.Size([10, 14, 14])
block 2= torch.Size([10, 7, 7])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-30-d0bea8693f34> in <cell line: 2>()
      1 print(image.shape)
----> 2 y_pred = modelv2(image)
      3 
      4 label=torch.argmax(y_pred)
      5 print(label)

8 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py in forward(self, input)
    112 
    113     def forward(self, input: Tensor) -> Tensor:
--> 114         return F.linear(input, self.weight, self.bias)
    115 
    116     def extra_repr(self) -> str:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (10x49 and 490x10)

Can you please check this?

bhuvanmdev commented 9 months ago

Just changing in_features of the nn.Linear layer to 49 should solve the problem, I guess.
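For context on where the 10x49 in the error comes from: nn.Flatten() flattens from dim 1 onward by default, so an unbatched tensor keeps its channel dimension as if it were the batch. The following is a minimal sketch, not code from the thread; the random tensors are stand-ins for the output of conv_block_2, with shapes taken from the printed output in the question.

```python
import torch
from torch import nn

flatten = nn.Flatten()  # defaults: start_dim=1, end_dim=-1

# Stand-ins for the conv_block_2 output (random values, assumed shapes)
unbatched = torch.randn(10, 7, 7)     # produced by a [1, 28, 28] input
batched = torch.randn(1, 10, 7, 7)    # produced by a [1, 1, 28, 28] input

flat_unbatched = flatten(unbatched)   # dim 0 (channels) is kept, 7*7 is merged
flat_batched = flatten(batched)       # dim 0 (batch) is kept, 10*7*7 is merged

print(flat_unbatched.shape)  # torch.Size([10, 49])
print(flat_batched.shape)    # torch.Size([1, 490])
```

With in_features=49 the unbatched call would run, but the Linear layer would return one row per channel (shape [10, 10]) instead of one row per image. Adding the batch dimension with unsqueeze(dim=0) and using in_features=hidden_units*7*7 gives one row of logits per image.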

nickyreinert commented 9 months ago

I suggest adding a debug print to your forward method, like this:

(Note: add a log_shapes: bool = True attribute to your model class.)

    def forward(self, x):

        if self.log_shapes : print(f"\nInput shape: {x.shape}")
        x = self.conv_block_1(x)
        if self.log_shapes : print(f"Conv Block 1 returns a shape of: {x.shape}")
        x = self.conv_block_2(x)
        if self.log_shapes : print(f"Conv Block 2 returns a shape of: {x.shape}")
        x = self.classifier(x)
        if self.log_shapes : print(f"Classifier returns a shape of (should fit count of classes): {x.shape}")

        self.log_shapes = False

        return x

This way you see the shape of the data within your layers.
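A self-contained sketch of the pattern above, for anyone who wants to try it in isolation. The class name ShapeLoggingNet, the single conv block, and the layer sizes are hypothetical and chosen only to keep the example short; the log_shapes flag is set in __init__ and switched off after the first forward pass, as in the snippet above.

```python
import torch
from torch import nn


class ShapeLoggingNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, log_shapes: bool = True):
        super().__init__()
        self.log_shapes = log_shapes  # print shapes on the first forward pass only
        self.conv_block = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),  # 28x28 -> 14x14
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features=10 * 14 * 14, out_features=10),
        )

    def forward(self, x):
        if self.log_shapes:
            print(f"Input shape: {x.shape}")
        x = self.conv_block(x)
        if self.log_shapes:
            print(f"Conv block returns a shape of: {x.shape}")
        x = self.classifier(x)
        if self.log_shapes:
            print(f"Classifier returns a shape of: {x.shape}")
        self.log_shapes = False  # stay silent on later calls
        return x


model = ShapeLoggingNet()
out = model(torch.randn(1, 1, 28, 28))        # prints the three shapes
out_again = model(torch.randn(1, 1, 28, 28))  # prints nothing
```

Setting model.log_shapes = True again re-enables the printout, which can be handy after changing the input size.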

Additional note: For production use, combine all layer steps into a single expression for the sake of performance:

return self.classifier(self.conv_block_2(self.conv_block_1(x)))

LuluW8071 commented 7 months ago

self.classifier = nn.Sequential(
    nn.Flatten(),
    nn.Linear(in_features=hidden_units,
              out_features=output_shape)
)

The Linear layer's in_features (number of rows of matrix 2) must match the number of flattened output features of your second conv block (number of columns of matrix 1).

RuntimeError: mat1 and mat2 shapes cannot be multiplied (10x49 and 490x10)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (10x49 and 10x10)
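One way to avoid working out in_features by hand is to push a dummy batch through the conv blocks and read the flattened size from the result. The sketch below assumes the same layer settings as the model in the question, 28x28 single-channel images, and 10 output classes; the conv_block helper and the variable names are hypothetical.

```python
import torch
from torch import nn

hidden_units = 10
num_classes = 10  # assumption: 10 classes, as in FashionMNIST


def conv_block(in_channels: int, out_channels: int) -> nn.Sequential:
    """Two 3x3 convs with padding=1, then a 2x2 max pool (halves height and width)."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2),
    )


features = nn.Sequential(conv_block(1, hidden_units),
                         conv_block(hidden_units, hidden_units))

# Dummy batch in [batch, channels, height, width] order
with torch.no_grad():
    feature_map = features(torch.randn(1, 1, 28, 28))     # [1, 10, 7, 7]
in_features = feature_map.flatten(start_dim=1).shape[1]   # 10 * 7 * 7

classifier = nn.Sequential(nn.Flatten(),
                           nn.Linear(in_features, num_classes))

# A single [1, 28, 28] image needs a batch dimension before the forward pass
image = torch.randn(1, 28, 28)
logits = classifier(features(image.unsqueeze(dim=0)))
print(in_features, logits.shape)  # 490 torch.Size([1, 10])
```

Here mat1 is [1, 490] and the Linear weight is applied as [490, 10], so the inner dimensions match and the multiplication succeeds.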