yihuacheng / IVGaze

What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation
Apache License 2.0

Question about RuntimeError #4

Open D-VVY opened 2 months ago

D-VVY commented 2 months ago

=====================>> (End) Traning params << =======================
===> Read data <===
-- [Read Data]: Total num: 29538
-- [Read Data]: Source: ['./data/Norm/label_class/train2.txt', './data/Norm/label_class/train3.txt']
===> Model building <===
Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /root/.cache/torch/hub/checkpoints/resnet18-5c106cde.pth
100.0%
===> optimizer building <===
===> Training <===
Traceback (most recent call last):
  File "trainer/leave.py", line 184, in <module>
    main(configi)
  File "trainer/leave.py", line 133, in main
    loss, losslist = net.loss(data, anno)
  File "/root/autodl-tmp/IVGaze-1/model.py", line 96, in loss
    gaze, , loss_gaze_o, loss_gaze_n = self.forward(x_in)
  File "/root/autodl-tmp/IVGaze-1/model.py", line 58, in forward
    feature_o, feature_list_o= self.borigin(x_in['origin_face'])
  File "/root/miniconda3/envs/gaze/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/autodl-tmp/IVGaze-1/IVModule.py", line 400, in forward
    feature = self.transformer(feature, self.outFeatureNum)
  File "/root/miniconda3/envs/gaze/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/autodl-tmp/IVGaze-1/IVModule.py", line 342, in forward
    feature_in = torch.cat([cls, feature], 0)
RuntimeError: Tensors must have same number of dimensions: got 3 and 2

When I run it locally, it shows a dimension mismatch. Has anyone encountered the same problem? Can anyone give me some advice? Thanks!

CrossEntropy commented 2 months ago

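Hi @D-VVY, I think the problem is in the 1x1 convolution module, specifically its squeeze operation. After average pooling the tensor has shape (bs, c, 1, 1), and squeeze() with no arguments removes every dimension of size 1. If a batch happens to contain a single sample, the batch dimension is dropped too, so the later torch.cat receives a tensor with one dimension fewer than it expects. Try modifying the code as follows so that the batch dimension is always kept. I hope it works!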

import torch
import torch.nn as nn


class conv1x1(nn.Module):
    """1x1 convolution, batch norm, and global average pooling, returning (bs, c) features."""

    def __init__(self, in_planes, out_planes, stride=1):
        super(conv1x1, self).__init__()
        self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
                              padding=0, bias=False)
        self.bn = nn.BatchNorm2d(out_planes)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))

    def forward(self, feature):
        output = self.conv(feature)
        output = self.bn(output)
        output = self.avgpool(output)  # shape: (bs, c, 1, 1)
        # output = output.squeeze()  # old operation: also drops the batch dim when bs == 1
        # new operation: flatten everything except the batch dimension
        bs, c, h, w = output.shape
        output = torch.reshape(output, (bs, c * h * w))
        return output
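
To see why the reshape is safer than a bare squeeze(), here is a small sketch in plain Python (so it runs without PyTorch) that mimics only the shape rules of the two operations. The helper names and the channel count of 128 are made up for illustration, and whether a single-sample batch is what actually triggered the error in this issue is an assumption, not something the traceback confirms.

```python
def squeeze_shape(shape):
    """Shape after squeeze() with no arguments: every size-1 dim is dropped."""
    return tuple(d for d in shape if d != 1)


def flatten_from_batch(shape):
    """Shape after reshaping (bs, c, h, w) to (bs, c * h * w), as in the fix above."""
    bs, c, h, w = shape
    return (bs, c * h * w)


# Hypothetical sizes: 128 channels after AdaptiveAvgPool2d((1, 1)).
for batch_size in (32, 1):
    pooled = (batch_size, 128, 1, 1)
    print(batch_size, squeeze_shape(pooled), flatten_from_batch(pooled))
```

With a batch of 32 both operations give a 2-D shape, but with a batch of 1 the squeeze rule leaves only one dimension while the reshape still yields (1, 128). Passing an explicit dimension to squeeze, or using torch.flatten(output, 1), would be alternative ways to keep the batch axis.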