Training Error about yolov4-tiny in branch u5

Hi all,

I am trying to train yolov4-tiny on the VisDrone dataset (10 classes) using the 'u5' branch. My model config is as follows:

# parameters
nc: 10  # number of classes (I changed this from 80 to 10)
depth_multiple: 1.0  # expand model depth
width_multiple: 1.0  # expand layer channels

# anchors
anchors:
  - [23,27,  37,58,  81,82]  # P4/16
  - [81,82,  135,169,  344,319]  # P5/32

# CSPVoVNet backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 2]],  # 0-P1/2
   [-1, 1, Conv, [64, 3, 2]],  # 1-P2/4

   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, VoVCSP, [64]],
   [[-2, -1], 1, Concat, [1]],
   [-1, 1, MP, [2]],  # 5-P3/8

   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, VoVCSP, [128]],
   [[-2, -1], 1, Concat, [1]],
   [-1, 1, MP, [2]],  # 9-P4/16

   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, VoVCSP, [256]],
   [[-2, -1], 1, Concat, [1]],
   [-1, 1, MP, [2]],  # 13-P5/32

   [-1, 1, Conv, [512, 3, 1]],  # 14
  ]

# yolov4-tiny head
# na = len(anchors[0])
head:
  [[-1, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]],

   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 11], 1, Concat, [1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]],

   [[], 1, Detect, [nc, anchors]],   # Detect(P4, P5)
  ]
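As a quick sanity check on the head, here is a small sketch of the `na * (nc + 5)` arithmetic using the values from the config above. I am assuming `na` is the number of (w, h) anchor pairs per scale, i.e. `len(anchors[0]) // 2`, since that is what the 45 output channels in the model summary imply. `head_out_channels` is just a helper name I made up for illustration.

```python
# Sanity check for the head's output channels (a sketch; values copied from the config).
anchors = [
    [23, 27, 37, 58, 81, 82],      # P4/16
    [81, 82, 135, 169, 344, 319],  # P5/32
]
nc = 10                    # number of classes
na = len(anchors[0]) // 2  # assumption: each anchor is a (w, h) pair, so 3 per scale


def head_out_channels(na: int, nc: int) -> int:
    """Channels of each final nn.Conv2d: na * (x, y, w, h, objectness + nc class scores)."""
    return na * (nc + 5)


print(na, head_out_channels(na, nc))  # prints: 3 45
```

So the class count change itself looks consistent with the two `nn.Conv2d` layers in the head.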

I then run the command:

python train.py --data VisDrone.yaml --cfg yolov4-tiny.yaml --weights '' --device 2 --batch-size 16

The output is as follows:


                 from  n    params  module                                  arguments                     
  0                -1  1       928  models.common.Conv                      [3, 32, 3, 2]                 
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]                
  2                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
  3                -1  1     22784  models.common.VoVCSP                    [64, 64, 1]                   
  4          [-2, -1]  1         0  models.common.Concat                    [1]                           
  5                -1  1         0  models.common.MP                        [2]                           
  6                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
  7                -1  1     90624  models.common.VoVCSP                    [128, 128, 1]                 
  8          [-2, -1]  1         0  models.common.Concat                    [1]                           
  9                -1  1         0  models.common.MP                        [2]                           
 10                -1  1    590336  models.common.Conv                      [256, 256, 3, 1]              
 11                -1  1    361472  models.common.VoVCSP                    [256, 256, 1]                 
 12          [-2, -1]  1         0  models.common.Concat                    [1]                           
 13                -1  1         0  models.common.MP                        [2]                           
 14                -1  1   2360320  models.common.Conv                      [512, 512, 3, 1]              
 15                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 16                -1  1   1180672  models.common.Conv                      [256, 512, 3, 1]              
 17                -1  1     23085  torch.nn.modules.conv.Conv2d            [512, 45, 1, 1]               
 18                -2  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 19                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 20          [-1, 11]  1         0  models.common.Concat                    [1]                           
 21                -1  1   1180160  models.common.Conv                      [512, 256, 3, 1]              
 22                -1  1     11565  torch.nn.modules.conv.Conv2d            [256, 45, 1, 1]               
 23                []  1         0  models.yolo.Detect                      [10, [[23, 27, 37, 58, 81, 82], [81, 82, 135, 169, 344, 319]], []]
Traceback (most recent call last):
  File "train.py", line 468, in <module>
    train(hyp, tb_writer, opt, device)
  File "train.py", line 80, in train
    model = Model(opt.cfg, nc=nc).to(device)
  File "/root/YOLOv4/models/yolo.py", line 70, in __init__
    m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))])  # forward
  File "/root/YOLOv4/models/yolo.py", line 99, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/root/YOLOv4/models/yolo.py", line 119, in forward_once
    x = m(x)  # run
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/YOLOv4/models/yolo.py", line 27, in forward
    x[i] = self.m[i](x[i])  # conv
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py", line 164, in __getitem__
    return self._modules[self._get_abs_string_index(idx)]
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py", line 154, in _get_abs_string_index
    raise IndexError('index {} is out of range'.format(idx))
IndexError: index 0 is out of range
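
My guess at what is happening, written as a plain-Python sketch rather than the actual code in models/yolo.py: the summary line for layer 23 shows Detect with an empty `from` list and an empty third argument. If Detect builds one output conv per input layer, it ends up with none, while the loop in `forward` still runs once per anchor row. `ToyDetect`, `try_forward`, and the `[256, 512]` channel list are hypothetical and only there for illustration; a plain list stands in for `nn.ModuleList`.

```python
class ToyDetect:
    """Hypothetical stand-in for Detect; a plain list plays the role of nn.ModuleList."""

    def __init__(self, nc, anchors, ch):
        self.nl = len(anchors)                 # number of detection layers (2 here)
        self.m = [f"conv_in{c}" for c in ch]   # assumption: one output conv per input layer

    def forward(self, x):
        for i in range(self.nl):
            conv = self.m[i]      # raises IndexError at i = 0 when ch is empty
            x[i] = (conv, x[i])   # placeholder for applying the conv
        return x


anchors = [[23, 27, 37, 58, 81, 82], [81, 82, 135, 169, 344, 319]]


def try_forward(ch):
    """Return the error message, or None if the forward pass succeeds."""
    try:
        ToyDetect(10, anchors, ch).forward(["p4", "p5"])
    except IndexError as err:
        return str(err)
    return None


print(try_forward([]))          # same arguments as the layer 23 summary line
print(try_forward([256, 512]))  # hypothetical non-empty channel list
```

If this reading is right, the problem would be the empty `[]` in the Detect line of the config rather than the change to `nc`, but I have not confirmed this against the code.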

Other network architectures, such as yolov4s-mish.cfg, work fine; only yolov4-tiny produces this error.

Is there any solution?

Thanks.

WongKinYiu / PyTorch_YOLOv4

Training Error about yolov4-tiny in branch u5 #423