RuntimeError: shape '[4, 5, 8, 8, 8, 64]' is invalid for input of size 696320

Tracy-git commented 2 years ago

I am using the yolov7-swin config to train on VOC2007. Training runs successfully, but when the model reaches val.run() it fails at x = x.view(B, H // window_size, window_size, W // window_size, window_size, C) with RuntimeError: shape '[4, 5, 8, 8, 8, 64]' is invalid for input of size 696320
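For readers following along, the mismatch in this message can be checked by hand. The sketch below only uses the numbers printed in the error; the variable names are illustrative and the interpretation of the six dimensions is taken from the view() call quoted above.

```python
from math import prod

# Numbers copied from the error message above.
target_shape = [4, 5, 8, 8, 8, 64]   # (B, H // ws, ws, W // ws, ws, C)
input_numel = 696320                  # actual number of elements in x

B, n_h, ws, n_w, _, C = target_shape
target_numel = prod(target_shape)

# Elements per (batch, channel) slice: the true H * W of the feature map.
true_hw = input_numel // (B * C)
# H * W implied by the view() target after floor division by ws.
floored_hw = (n_h * ws) * (n_w * ws)

print(target_numel)  # 655360
print(true_hw)       # 2720
print(floored_hw)    # 2560
```

The target shape describes fewer elements than the tensor holds, which is what happens when H or W is not a multiple of the window size and the floor division drops the remainder.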

iscyy commented 2 years ago

Is the specific yolov7-swin the built-in configuration you use, or the configuration you modified yourself?

Tracy-git commented 2 years ago

I used the config under the yolov7-transformer-Improved folder that you provided. This error only occurs during validation; if I set noval, no error occurs.

iscyy commented 2 years ago

I just tested one of the swin configurations under the yolov7-transformer-Improved folder and there is no such problem as you said, val can run normally

Tracy-git commented 2 years ago

Thank you for your reply. Specifically, I use yolov7-Swin-HorNet-attention.yaml with the image size set to 512 or 640, and this error is raised. Could you share the model configuration you use in your train.py?

iscyy commented 2 years ago

The configurations in train.py are all defaults. Please make sure you are using the latest code.

Tracy-git commented 2 years ago

Sorry to bother you. I am also using the default configuration now. My question is why this error is not raised during training but is raised during validation after the first epoch. I think the image size is the problem.

Tracy-git commented 2 years ago

I seem to have solved this problem. When the input size is greater than 1024, the model can be trained and validated successfully. When the input size is smaller than 1024, only training works, and this error is raised during validation. I don't know why.

iscyy commented 2 years ago

Can you tell me what train command you are using? This should have nothing to do with the image size

Tracy-git commented 2 years ago

First of all, thanks for your reply. My training configuration is: --cfg configs\yolov7-transformer-Improved\yolov7-Swin-ConvNext.yaml --data data\MYVOC.yaml --otaloss yolov7 --batch-size 2 --imgsz 1080 --epochs 100. The error is shown in the following picture.

iscyy commented 2 years ago

Use python train.py --cfg configs/yolov7-transformer-Improved/yolov7-Swin-ConvNext.yaml --otaloss yolov7 (--img<640). I do not see the problem you described.

iscyy commented 2 years ago

Hi, that's about it. The mAP is 0 because only one epoch was tested, and on the coco128 dataset; a few more epochs are enough.

RogerHuangPKX commented 2 years ago

I hit the same error. When I train my model with configs/yolov7-transformer-Improved/yolov7-Swin-HorNet.yaml, the error appears during validation: RuntimeError: shape '[24, 7, 8, 10, 8, 64]' is invalid for input of size 7225344. I don't know why. Training succeeds but validation does not. My command is as follows:

python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train.py --workers 16 --device 0,1,2,3,4,5,6,7 --sync-bn --batch-size 96 --data data/apple.yaml --img 1280 --cfg configs/yolov7-transformer-Improved/yolov7-Swin-HorNet.yaml --weights 'yolov7-w6_training.pt' --name yolov7-trans --hyp data/hyps/hyp.scratch-low.yaml
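The feature-map size that triggered this message can be recovered from the message alone. The helper below is a hypothetical sketch (recover_hw is not part of the repository): it assumes the target shape is (B, H // ws, ws, W // ws, ws, C), as in the view() call from the issue title, and enumerates every (H, W) that is consistent with the reported element count.

```python
def recover_hw(target_shape, input_numel):
    """Return every (H, W) consistent with a failed window_partition view().

    target_shape is the shape printed in the RuntimeError,
    input_numel is the "input of size" value from the same message.
    """
    B, n_h, ws, n_w, _, C = target_shape
    if input_numel % (B * C) != 0:
        raise ValueError("element count is not divisible by B * C")
    hw = input_numel // (B * C)          # true H * W
    candidates = []
    # H // ws == n_h means H lies in [n_h * ws, (n_h + 1) * ws).
    for H in range(n_h * ws, (n_h + 1) * ws):
        if hw % H == 0:
            W = hw // H
            if n_w * ws <= W < (n_w + 1) * ws:
                candidates.append((H, W))
    return candidates


print(recover_hw([24, 7, 8, 10, 8, 64], 7225344))  # [(56, 84)]
```

Here the height (56) is a multiple of the window size 8 but the width (84) is not, so the view() target is short by four columns per row.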

RogerHuangPKX commented 2 years ago

But when I change the model to yolov7-transCoT3-HorNet.yaml, there is no error. It seems that something is wrong with Swin.

iscyy commented 2 years ago

@RogerHuangPKX Yes, now the Swin part has been updated to fix this error

Tracy-git commented 2 years ago

When I use your new config file [yolov7-Swin-v2-ConvNext.yaml], a new error comes up: NameError: name 'SwinV2_CSPB' is not defined. It seems the SwinV2_CSPB block from your update is not defined.

iscyy commented 2 years ago

You can use the yolov5s-swintransformer_v2.yaml file with the latest code.

Seperendity commented 1 year ago

@iscyy Hi, can you tell me why the error occurs only in val? I can't find the bug. I used C3STR and hit the same error. Thank you so much.

swimmant commented 1 year ago

@iscyy Hi, I seem to have found a reason: the C3STR output may be causing it. When I use C3, it works.
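A common way for Swin-style blocks to tolerate arbitrary feature-map sizes is to pad the bottom and right edges up to the next multiple of the window size before partitioning, then crop afterwards. The sketch below only computes the padding amounts; window_padding is a hypothetical helper, and this is not confirmed to be the fix that was applied in this repository. The 40 x 68 example is what the first error in this thread factors into (4 * 40 * 68 * 64 = 696320).

```python
def window_padding(H, W, window_size):
    """Bottom and right padding needed so H and W divide by window_size."""
    pad_b = (window_size - H % window_size) % window_size
    pad_r = (window_size - W % window_size) % window_size
    return pad_b, pad_r


for H, W in [(40, 68), (40, 40)]:
    pad_b, pad_r = window_padding(H, W, 8)
    # For an NCHW tensor x, the padding could then be applied with
    # torch.nn.functional.pad(x, (0, pad_r, 0, pad_b)).
    print((H, W), "->", (H + pad_b, W + pad_r))
# (40, 68) -> (40, 72)
# (40, 40) -> (40, 40)
```

Both dimensions need to be checked independently: in 40 x 68 the height already divides by 8 and only the width needs padding, so a guard that looks at the height alone would let this case through.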

Rogers98 commented 1 year ago

@iscyy Hello, I ran into the same problem and wonder whether the dataset is what causes it.

HOW TO REPRODUCE:

① When I use

python3 train.py --cfg myConfig.yaml --img 640 --batch 64 --epochs 100 --data VisDrone.yaml --weights yolov5s.pt

training succeeds, but during validation the console output is as follows:

File "/home/workspace/yoloair/models/Models/SwinTransformer.py", line 242, in forward
    x_windows = window_partition(shifted_x, self.window_size)  # nW*B, window_size, window_size, C
  File "/home/workspace/yoloair/models/Models/SwinTransformer.py", line 122, in window_partition
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
RuntimeError: shape '[128, 3, 8, 5, 8, 64]' is invalid for input of size 8257536

② When I change VisDrone.yaml to COCO128.yaml, like:

python3 train.py --cfg myConfig.yaml --img 640 --batch 64 --epochs 100 --data coco128.yaml --weights yolov5s.pt

Training and validation both run normally.

VERSION

Newest isccy-beta.

CONFIG

myConfig.yaml is as follows; the only modification is changing C3 to C3STR.

nc: 10  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors: 3

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [ [ -1, 1, Conv, [ 64, 6, 2, 2 ] ],  # 0-P1/2
    [ -1, 1, Conv, [ 128, 3, 2 ] ],  # 1-P2/4
    [ -1, 3, C3, [ 128 ] ],
    [ -1, 1, Conv, [ 256, 3, 2 ] ],  # 3-P3/8
    [ -1, 6, C3, [ 256 ] ],
    [ -1, 1, Conv, [ 512, 3, 2 ] ],  # 5-P4/16
    [ -1, 3, C3STR, [ 512 ] ],
    [ -1, 1, Conv, [ 1024, 3, 2 ] ],  # 7-P5/32
    [ -1, 3, C3, [ 1024 ] ],
    [ -1, 1, SPPF, [ 1024, 5 ] ],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [ [ -1, 1, Conv, [ 512, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 6 ], 1, Concat, [ 1 ] ],  # cat backbone P4
    [ -1, 3, C3, [ 512, False ] ],  # 13

    [ -1, 1, Conv, [ 256, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 4 ], 1, Concat, [ 1 ] ],  # cat backbone P3
    [ -1, 3, C3, [ 256, False ] ],  # 17 (P3/8-small)

    [ -1, 1, Conv, [ 256, 3, 2 ] ],
    [ [ -1, 14 ], 1, Concat, [ 1 ] ],  # cat head P4
    [ -1, 3, C3, [ 512, False ] ],  # 20 (P4/16-medium)

    [ -1, 1, Conv, [ 512, 3, 2 ] ],
    [ [ -1, 10 ], 1, Concat, [ 1 ] ],  # cat head P5
    [ -1, 3, C3, [ 1024, False ] ],  # 23 (P5/32-large)

    [ [ 17, 20, 23 ], 1, Detect, [ nc, anchors ] ],  # Detect(P3, P4, P5)
  ]

OUTPUT

Starting training for 100 epochs...

     Epoch   gpu_mem       box       obj       cls    labels  img_size
      0/99     6.76G    0.1417    0.1173     0.055       493       640: 100%|██████████| 102/102 [01:35<00:00,  1.07it/s]                                                                                                                                                   
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95:   0%|          | 0/5 [00:00<?, ?it/s]                                                                                                                                               
Traceback (most recent call last):
  File "/home/workspace/yoloair2/train.py", line 696, in <module>
    main(opt)
  File "/home/workspace/yoloair2/train.py", line 592, in main
    train(opt.hyp, opt, device, callbacks)
  File "/home/workspace/yoloair2/train.py", line 416, in train
    results, maps, _ = val.run(data_dict,
  File "/home/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/workspace/yoloair2/val.py", line 196, in run
    out, train_out = model(im) if training else model(im, augment=augment, val=True)  # inference, loss outputs
  File "/home/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/workspace/yoloair2/models/yolo.py", line 166, in forward
    return self._forward_once(x, profile, visualize)  # single-scale inference, train
  File "/home/workspace/yoloair2/models/yolo.py", line 189, in _forward_once
    x = m(x)  # run
  File "/home/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/workspace/yoloair2/models/Models/muitlbackbone.py", line 486, in forward
    return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))
  File "/home/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/workspace/yoloair2/models/Models/muitlbackbone.py", line 432, in forward
    x = self.blocks(x)
  File "/home/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/.local/lib/python3.10/site-packages/torch/nn/modules/container.py", line 204, in forward
    input = module(input)
  File "/home/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/workspace/yoloair2/models/Models/SwinTransformer.py", line 242, in forward
    x_windows = window_partition(shifted_x, self.window_size)  # nW*B, window_size, window_size, C
  File "/home/workspace/yoloair2/models/Models/SwinTransformer.py", line 122, in window_partition
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
RuntimeError: shape '[128, 3, 8, 5, 8, 64]' is invalid for input of size 8257536
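The numbers in this traceback fit a non-square validation batch. The sketch below assumes two things that are not verified against this fork: that the C3STR layer sees the stride-16 feature map (it follows the Conv marked "5-P4/16" in myConfig.yaml), and that validation uses rectangular batches at twice the training batch size, as upstream YOLOv5 does. Under those assumptions a 384 x 672 batch reproduces the reported element count exactly, while a square 640 x 640 training batch partitions cleanly.

```python
STRIDE = 16   # assumed: C3STR follows the "5-P4/16" Conv in myConfig.yaml
WINDOW = 8    # window size visible in the error's target shape


def feature_hw(img_h, img_w, stride=STRIDE):
    """Feature-map height and width at the given stride."""
    return img_h // stride, img_w // stride


def partitions_cleanly(img_h, img_w, stride=STRIDE, window=WINDOW):
    """True if both feature-map sides are multiples of the window size."""
    h, w = feature_hw(img_h, img_w, stride)
    return h % window == 0 and w % window == 0


print(feature_hw(640, 640), partitions_cleanly(640, 640))  # (40, 40) True
print(feature_hw(384, 672), partitions_cleanly(384, 672))  # (24, 42) False

# B=128 and C=64 are taken from the error message; 24 x 42 is the
# stride-16 feature map of a 384 x 672 batch.
print(128 * 24 * 42 * 64)  # 8257536
```

If the assumptions hold, this would explain why training at --img 640 never fails while validation does: training batches are square, and rectangular validation batches are only padded to a multiple of the model stride, not to a multiple of stride times window size.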

EudicL commented 7 months ago

How did you solve it?

iscyy / yoloair