VIOLINARTHUR / HKU-DASC7606-A1

Error in Step #7: Train the convolutional neural network

Open LonelyStalker opened 2 weeks ago

LonelyStalker commented 2 weeks ago

I defined the convolutional network following the provided architecture. Here is the code:


class ConvolutionalNet(nn.Module):
    def __init__(self):
        super().__init__()

        # Define 5 convolutional layers
        # conv2d, 5x5, 3->8, padding=2
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=5, padding=2)
        # conv2d, 5x5, 8->16, padding=2, stride=2
        self.conv2 = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=5, padding=2, stride=2)
        # conv2d, 5x5, 16->32, padding=2
        self.conv3 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, padding=2)
        # conv2d, 5x5, 32->64, padding=2, stride=2
        self.conv4 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5, padding=2, stride=2)
        # conv2d, 5x5, 64->128, padding=2
        self.conv5 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=5, padding=2)

        # Define max pooling layer
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

        # Define 3 fully connected layers
        # fc, the whole feature map -> 120
        self.fc1 = nn.Linear(128 * 8 * 8, 120)
        # fc, 120 -> 84
        self.fc2 = nn.Linear(120, 84)
        # fc, 84 -> 10
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Forward 5 convolutional layer
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x = F.relu(self.conv5(x))

        # Forward max pooling layer
        x = self.pool(x)

        # Flatten the feature map
        x = x.view(-1, 128 * 8 * 8)

        # Forward 3 fully connected layers with ReLU
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)

        return x

cn_net = ConvolutionalNet()

But during the training step, a ValueError was raised. Here is the code and the error traceback:


# Get optimizer
optimizer = get_optimizer(cn_net, 0.01)

# Train the network
train(cn_net, trainloader, optimizer, 5)

ValueError                                Traceback (most recent call last)
Cell In[21], line 5
      2 optimizer = get_optimizer(cn_net, 0.01)
      4 # Train the network
----> 5 train(cn_net, trainloader, optimizer, 5)

Cell In[7], line 24, in train(net, loader, optimizer, max_epoch)
     21 optimizer.zero_grad()
     23 # forward + backward + optimize
---> 24 outputs, loss, labels = forward_step(net, images, labels)
     25 loss.backward()
     26 optimizer.step()

Cell In[7], line 3, in forward_step(net, inputs, labels)
      1 def forward_step(net, inputs, labels):
      2     outputs = net(inputs)
----> 3     loss = criterion(outputs, labels)
      4     return outputs, loss, labels

File ~/anaconda3/envs/cv_env/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/anaconda3/envs/cv_env/lib/python3.10/site-packages/torch/nn/modules/loss.py:1174, in CrossEntropyLoss.forward(self, input, target)
   1173 def forward(self, input: Tensor, target: Tensor) -> Tensor:
-> 1174     return F.cross_entropy(input, target, weight=self.weight,
   1175                            ignore_index=self.ignore_index, reduction=self.reduction,
   1176                            label_smoothing=self.label_smoothing)

File ~/anaconda3/envs/cv_env/lib/python3.10/site-packages/torch/nn/functional.py:3029, in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   3027 if size_average is not None or reduce is not None:
   3028     reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 3029 return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

ValueError: Expected input batch_size (16) to match target batch_size (64).

I didn't change any part of the provided code, especially the part that sets batch_size, so this is quite confusing to me.

Is this because the provided train() function is not suitable for a CNN and I should write a new train() function for batch_size = 16, or is there something wrong with my CNN definition?

LonelyStalker commented 2 weeks ago

If I change the definition of batch_size to 16, 256, or any other value, the traceback always reports an input batch_size that is 1/4 of the target batch_size. Here are the further ValueError messages:

ValueError: Expected input batch_size (4) to match target batch_size (16).
ValueError: Expected input batch_size (64) to match target batch_size (256).
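
The constant 1/4 ratio suggests the batch dimension is being reshaped rather than the loader misbehaving. A minimal sketch of the shape arithmetic follows, assuming 32x32 input images (the thread does not state the image size; 32x32 is a guess consistent with the reported numbers). The helper names `conv_out` and `batch_after_view` are hypothetical and only illustrate the calculation:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# ASSUMPTION: 32x32 inputs; the issue does not state the image size.
size = 32
for stride in (1, 2, 1, 2, 1):  # conv1..conv5, all 5x5 kernels with padding=2
    size = conv_out(size, kernel=5, stride=stride, padding=2)
size = conv_out(size, kernel=2, stride=2)  # MaxPool2d(kernel_size=2, stride=2)

features_per_image = 128 * size * size  # what the conv stack would produce
features_expected = 128 * 8 * 8         # what fc1 and x.view(...) assume


def batch_after_view(batch_size):
    # x.view(-1, features_expected) preserves the total element count
    # and infers the first dimension from it.
    return batch_size * features_per_image // features_expected


print("feature map side after pooling:", size)
for b in (16, 64, 256):
    print(b, "->", batch_after_view(b))
```

Under that assumption the pooled feature map is 4x4 rather than 8x8, so `x.view(-1, 128 * 8 * 8)` folds four images into each row and the batch shrinks by a factor of four, matching the three error messages. If so, the provided train() function would not need changing: sizing fc1 as `nn.Linear(128 * 4 * 4, 120)` and flattening with `x.view(-1, 128 * 4 * 4)` (or `torch.flatten(x, 1)`) should keep the batch dimension intact.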