meituan / YOLOv6

YOLOv6: a single-stage object detection framework dedicated to industrial applications.
GNU General Public License v3.0
5.72k stars 1.03k forks

UNet+YOLO #352

Closed. geekdreamer04 closed this issue 2 years ago

geekdreamer04 commented 2 years ago

When I try to add a few upsampling and downsampling layers (a UNet) before the EfficientRep backbone, training fails with the following error.

ERROR in training steps.
ERROR in training loop or eval/save model.

Training completed in 0.000 hours.
Traceback (most recent call last):
  File "tools/train.py", line 112, in <module>
    main(args)
  File "tools/train.py", line 102, in main
    trainer.train()
  File "/workspace/YOLOv61/yolov6/core/engine.py", line 75, in train
    self.train_in_loop()
  File "/workspace/YOLOv61/yolov6/core/engine.py", line 88, in train_in_loop
    self.train_in_steps()
  File "/workspace/YOLOv61/yolov6/core/engine.py", line 104, in train_in_steps
    preds = self.model(images)
  File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/YOLOv61/yolov6/models/yolo.py", line 39, in forward
    x = self.detect(x)
  File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/YOLOv61/yolov6/models/effidehead.py", line 60, in forward
    x[i] = self.stems[i](x[i])
  File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/YOLOv61/yolov6/layers/common.py", line 102, in forward
    return self.act(self.bn(self.conv(x)))
  File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 446, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [256, 256, 1, 1], expected input[8, 128, 20, 20] to have 256 channels, but got 128 channels instead

This is the code for the UNet structure that I have added:

import torch
import torch.nn as nn
import math
from yolov6.layers.common import *

from .UNet_parts import *


def double_conv(in_channels, out_channels):
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, 3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_channels, out_channels, 3, padding=1),
        nn.ReLU(inplace=True)
    )


class UNet(nn.Module):
    def __init__(
        self,
        in_channels=3,
        channels_list=None,
        num_repeats=None,
        bilinear=None,
    ):
        super().__init__()

        self.dconv_down1 = double_conv(in_channels, 64)
        self.dconv_down2 = double_conv(64, 128)
        self.dconv_down3 = double_conv(128, 256)
        self.dconv_down4 = double_conv(256, 512)

        self.maxpool = nn.MaxPool2d(2)
        self.upsample = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)

        self.dconv_up3 = double_conv(256, 256)
        self.dconv_up2 = double_conv(128, 128)
        self.dconv_up1 = double_conv(128, channels_list[0])

        # self.conv_last = nn.Conv2d(64, channels_list[0], 1)

    def forward(self, x):
        conv1 = self.dconv_down1(x)
        x = self.maxpool(conv1)

        conv2 = self.dconv_down2(x)
        x = self.maxpool(conv2)

        conv3 = self.dconv_down3(x)
        x = self.maxpool(conv3)

        x = self.dconv_down4(x)

        x = self.upsample(x)
        x = torch.cat([x, conv3], dim=1)

        x = self.dconv_up3(x)
        x = self.upsample(x)
        x = torch.cat([x, conv2], dim=1)

        x = self.dconv_up2(x)
        x = self.upsample(x)
        x = torch.cat([x, conv1], dim=1)

        out = self.dconv_up1(x)

        # out = self.conv_last(x)
        return out

mtjhl commented 2 years ago

Hi, please check the channels: the input channels of each conv must equal the output channels of the previous layer.
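
To illustrate the channel check suggested above, here is a minimal sketch in plain Python (no PyTorch needed) that does the bookkeeping for the decoder of the posted UNet. It relies on two facts: upsampling keeps the channel count, and `torch.cat(..., dim=1)` adds the channel counts of its inputs. The helper name `decoder_channel_report` is hypothetical, and 64 stands in for `channels_list[0]`, whose real value is not given in the thread.

```python
def decoder_channel_report(skips, bottleneck, declared):
    """Compare declared in_channels with what each decoder block would receive.

    skips:      encoder output channels, shallowest first, e.g. [64, 128, 256]
    bottleneck: channels leaving the deepest encoder block
    declared:   (in_channels, out_channels) per decoder block, deepest first
    Returns a list of (declared_in, actual_in, matches) tuples.
    """
    report = []
    x = bottleneck
    for skip, (declared_in, out) in zip(reversed(skips), declared):
        # Upsampling keeps channels; concatenating on dim=1 adds the skip's channels.
        actual_in = x + skip
        report.append((declared_in, actual_in, declared_in == actual_in))
        x = out  # the block's output feeds the next decoder stage
    return report


# Decoder as posted: dconv_up3, dconv_up2, dconv_up1.
# 64 is an assumed placeholder for channels_list[0].
posted = decoder_channel_report([64, 128, 256], 512,
                                [(256, 256), (128, 128), (128, 64)])

# Same decoder with in_channels set to the concatenated widths.
fixed = decoder_channel_report([64, 128, 256], 512,
                               [(768, 256), (384, 128), (192, 64)])

for declared_in, actual_in, ok in posted:
    print(f"declared {declared_in}, receives {actual_in}, ok={ok}")
```

Under these assumptions, each decoder block in the posted code would receive more channels than it declares (768, 384 and 192 instead of 256, 128 and 128). Note that the traceback itself points at `self.stems[i]` in `effidehead.py`, which expects 256 channels and receives 128, so the same kind of check is worth repeating on the feature maps that reach the head.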