ultralytics / yolov5

YOLOv5 πŸš€ in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Yolov5-6.0 Specific Bug: The expanded size of the tensor (1) must match the existing size (4) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 4, 4, 2] #5234

Closed: SpaceView closed this issue 2 years ago

SpaceView commented 3 years ago

This is a bug specific to YOLOv5-6.0; YOLOv5-5.0 doesn't have this problem. To reproduce the bug:

Platform: Windows 10
Python: 3.9.7
Torch: 1.9.1

(step 1) In detect.py, change the following defaults:
def parse_opt():
    parser = argparse.ArgumentParser()
    #parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'weights/yolov5s.pt', help='model path(s)')
    parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'weights/yolov5n.pt', help='model path(s)')
    parser.add_argument('--source', type=str, default=ROOT / 'data/images/bus.jpg', help='file/dir/URL/glob, 0 for webcam')

(step 2) Open the project in VS Code; the root directory is "d:/yolov5-master".
(By the way, I also tested with yolov5-6.0, root directory "d:/yolov5-6.0"; it gives the same error.)

(step 3) Run the detect.py script in VS Code.

The error info is given below:

Exception has occurred: RuntimeError       (note: full exception trace is shown but execution is paused at: forward)
The expanded size of the tensor (1) must match the existing size (4) at non-singleton dimension 3.  Target sizes: [1, 3, 1, 1, 2].  Tensor sizes: [3, 4, 4, 2]
  File "D:\vsAI\yolov5-6.0\models\yolo.py", line 61, in forward (Current frame)
    self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
  File "D:\Anaconda3\envs\torch\Lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\vsAI\yolov5-6.0\models\yolo.py", line 149, in _forward_once
    x = m(x)  # run
  File "D:\vsAI\yolov5-6.0\models\yolo.py", line 126, in forward
    return self._forward_once(x, profile, visualize)  # single-scale inference, train
  File "D:\Anaconda3\envs\torch\Lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Anaconda3\envs\torch\Lib\site-packages\thop\profile.py", line 188, in profile
    model(*inputs)
  File "D:\vsAI\yolov5-6.0\utils\torch_utils.py", line 236, in model_info
    flops = profile(deepcopy(model), inputs=(img,), verbose=False)[0] / 1E9 * 2  # stride GFLOPs
  File "D:\vsAI\yolov5-6.0\models\yolo.py", line 235, in info
    model_info(self, verbose, img_size)
  File "D:\vsAI\yolov5-6.0\models\yolo.py", line 225, in fuse
    self.info()
  File "D:\vsAI\yolov5-6.0\models\experimental.py", line 96, in attempt_load
    model.append(ckpt['ema' if ckpt.get('ema') else 'model'].float().fuse().eval())  # FP32 model
  File "D:\vsAI\yolov5-6.0\detect.py", line 82, in run
    model = torch.jit.load(w) if 'torchscript' in w else attempt_load(weights, map_location=device)
  File "D:\Anaconda3\envs\torch\Lib\site-packages\torch\autograd\grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "D:\vsAI\yolov5-6.0\detect.py", line 302, in main
    run(**vars(opt))
  File "D:\vsAI\yolov5-6.0\detect.py", line 307, in <module>
    main(opt)
  File "D:\Anaconda3\envs\torch\Lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Anaconda3\envs\torch\Lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "D:\Anaconda3\envs\torch\Lib\runpy.py", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "D:\Anaconda3\envs\torch\Lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Anaconda3\envs\torch\Lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,

It seems the following line is the problem:

self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)

I used the following equivalent code to debug it:

tmp_grid, tmp_anchor_grid =  self._make_grid(nx, ny, i)
self.grid[i] = tmp_grid
self.anchor_grid[i] = tmp_anchor_grid

and found that when i == 0:

self.anchor_grid[0].shape --> torch.Size([1, 3, 1, 1, 2])
tmp_anchor_grid.shape --> torch.Size([1, 3, 4, 4, 2])
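
To make the failure concrete, here is a minimal standalone sketch; the shapes come from the debug output above, and the (nl, 1, na, 1, 1, 2) buffer layout is my assumption about the pre-6.0 Detect head:

import torch

# Pre-6.0 checkpoints register anchor_grid as one buffer of shape (nl, 1, na, 1, 1, 2).
# Indexing a tensor yields a view, so "anchor_grid[0] = ..." copies in place and must
# broadcast the new tensor to the view's fixed shape.
anchor_grid = torch.zeros(3, 1, 3, 1, 1, 2)   # nl=3 layers, na=3 anchors per layer
tmp_anchor_grid = torch.zeros(1, 3, 4, 4, 2)  # freshly built grid for a 4x4 feature map

try:
    anchor_grid[0] = tmp_anchor_grid  # in-place copy into a [1, 3, 1, 1, 2] view
except RuntimeError as e:
    print(e)  # "The expanded size of the tensor (1) must match the existing size (4) ..."

# With a Python list (the 6.0 design), rebinding the element is legal:
anchor_grid_list = [torch.zeros(1, 3, 1, 1, 2) for _ in range(3)]
anchor_grid_list[0] = tmp_anchor_grid  # no error: the list slot is simply replaced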

The problem seems to come from thop.profile:

flops = profile(deepcopy(model), inputs=(img,), verbose=False)[0] / 1E9 * 2  # stride GFLOPS

Currently I have no idea how this comes about; where is self.anchor_grid[0] coming from?

When I run the script in the Windows PowerShell console, I get no such bug, as below:

$ python detect.py --source ./data/images/bus.jpg
detect: weights=yolov5s.pt, source=./data/images/bus.jpg, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2021-10-15 torch 1.9.1 CUDA:0 (GeForce GTX 1080 Ti, 11264.0MB)

Fusing layers...
Model Summary: 213 layers, 7225885 parameters, 0 gradients
image 1/1 D:\vsAI\yolov5-master\data\images\bus.jpg: 640x480 4 persons, 1 bus, Done. (0.00
Speed: 1.0ms pre-process, 8.0ms inference, 5.0ms NMS per image at shape (1, 3, 640, 640)
github-actions[bot] commented 3 years ago

πŸ‘‹ Hello @SpaceView, thank you for your interest in YOLOv5 πŸš€! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a πŸ› Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python>=3.6.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled): Google Colab and Kaggle notebooks with free GPU, Google Cloud Deep Learning VM, Amazon Deep Learning AMI, and the official Docker Image.

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented 3 years ago

@SpaceView thanks for the bug report. This might just be due to out of date code or models. I tested this locally in PyCharm MacOS with python 3.9 and everything seems fine:

[screenshot: detect.py completing successfully in PyCharm]

The CI tests regularly run YOLOv5n with all main functions (train, val, detect, export) on Windows also and they are green currently: https://github.com/ultralytics/yolov5/runs/3937706191?check_suite_focus=true

glenn-jocher commented 3 years ago

@fcakyon @SpaceView I'm not able to reproduce any error here. The following two examples execute correctly in Colab.

!python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5n.pt
!python detect.py --weights runs/train/exp/weights/best.pt

!python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights '' --cfg yolov5n.yaml
!python detect.py --weights runs/train/exp2/weights/best.pt

Response from detect.py calls is:

detect: weights=['runs/train/exp/weights/best.pt'], source=data/images, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 πŸš€ v6.0-23-ga18b0c3 torch 1.9.0+cu111 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)

Fusing layers... 
Model Summary: 213 layers, 1867405 parameters, 0 gradients, 4.5 GFLOPs
image 1/2 /content/yolov5/data/images/bus.jpg: 640x480 4 persons, 1 bus, 1 skateboard, Done. (0.015s)
image 2/2 /content/yolov5/data/images/zidane.jpg: 384x640 2 persons, 1 tie, Done. (0.016s)
Speed: 0.4ms pre-process, 15.3ms inference, 1.6ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp

detect: weights=['runs/train/exp2/weights/best.pt'], source=data/images, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 πŸš€ v6.0-23-ga18b0c3 torch 1.9.0+cu111 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)

Fusing layers... 
Model Summary: 213 layers, 1867405 parameters, 0 gradients, 4.5 GFLOPs
image 1/2 /content/yolov5/data/images/bus.jpg: 640x480 Done. (0.016s)
image 2/2 /content/yolov5/data/images/zidane.jpg: 384x640 Done. (0.017s)
Speed: 0.4ms pre-process, 16.4ms inference, 0.4ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp2

We've created a few short guidelines below to help users provide what we need in order to get started investigating a possible problem.

How to create a Minimal, Reproducible Example

When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:

βœ… Minimal – Use as little code as possible that still produces the same problem
βœ… Complete – Provide all parts someone else needs to reproduce the problem
βœ… Reproducible – Test the code you're about to provide to make sure it reproduces the problem

In addition to the above requirements, for Ultralytics to provide assistance your code should be:

βœ… Current – Verify that your code is up-to-date with the current GitHub master, and if necessary git pull or git clone a new copy to ensure your problem has not already been resolved
βœ… Unmodified – Your problem must be reproducible using official YOLOv5 code without changes

If you believe your problem meets all of the above criteria, please close this issue and raise a new one using the πŸ› Bug Report template and providing a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! πŸ˜ƒ

glenn-jocher commented 3 years ago

I also trained a new model starting from a custom-trained model (exp2/weights/best.pt), and when detecting again with the new exp3/weights/best.pt everything worked correctly:

!python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights runs/train/exp2/weights/best.pt
!python detect.py --weights runs/train/exp3/weights/best.pt
jebastin-nadar commented 3 years ago

Hi @SpaceView @fcakyon, yes the bug originates from my PR. I have tried to reproduce the error with pre-trained and custom-trained yolov5n from scratch (similar code to @glenn-jocher's), but detect.py works correctly with both models.


self.anchor_grid is supposed to be a list of Tensors, but from the error message, it looks like self.anchor_grid is a Tensor (it was a Tensor before my PR was merged) and assigning a Tensor of different shape is raising this error.
You can check this by adding print(type(self.anchor_grid)) in forward() of the Detect module.
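
For instance, a quick standalone check along these lines (a sketch; it assumes you run from the yolov5 repo root with an official checkpoint such as yolov5n.pt available):

import torch
from models.experimental import attempt_load

# Load a checkpoint the same way detect.py does; the Detect head is the last module.
model = attempt_load('yolov5n.pt', map_location='cpu')
detect = model.model[-1]  # Detect()
print(type(detect.anchor_grid))  # expect <class 'list'>; torch.Tensor here means the old buffer survived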

This conversion of Tensor to list of Tensors is done in attempt_load()

https://github.com/ultralytics/yolov5/blob/a18b0c36cd4df0d3b9c2623c5dda009c5f281ac9/models/experimental.py#L106-L108
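
Paraphrasing those linked lines (a sketch of that commit, not a verbatim quote): for every Detect module whose anchor_grid is still a Tensor buffer, the buffer is deleted and replaced with a list of placeholder tensors, which forward() can then rebind freely.

# Inside attempt_load() in models/experimental.py (paraphrased sketch):
for m in model.modules():
    if type(m) is Detect and not isinstance(m.anchor_grid, list):  # pre-PR checkpoint
        delattr(m, 'anchor_grid')  # drop the old registered buffer
        setattr(m, 'anchor_grid', [torch.zeros(1)] * m.nl)  # one placeholder per detection layer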

and I can see from the traceback that this function is called at runtime:

File "D:\vsAI\yolov5-6.0\models\experimental.py", line 96, in attempt_load

Backward compatibility with models trained before my PR was checked before merging it, so it's quite strange to see this bug. As suggested by Glenn, some more reproducer code/models are needed.

fcakyon commented 3 years ago

@glenn-jocher @SamFC10 the error is raised when a model trained on the 5.0 source is used with detect.py from the 6.0 source. The compatibility addition seems not to be working for some reason.

jebastin-nadar commented 3 years ago

@fcakyon Please add a link to your trained model if possible. Some edge case is being missed.

fcakyon commented 3 years ago

I cannot share it for privacy reasons; I will try to train a disposable model for reproducibility.

SpaceView commented 3 years ago

@glenn-jocher @fcakyon @SamFC10 Many thanks for your attention. I use VS Code on Windows 10. If you don't use it, the thop.profile failure may pass without any warning, so you need to add some extra instrumentation to reproduce this bug, as below:

def model_info(model, verbose=False, img_size=640):
    # Model information. img_size may be int or list, i.e. img_size=640 or img_size=[640, 320]
    n_p = sum(x.numel() for x in model.parameters())  # number parameters
    n_g = sum(x.numel() for x in model.parameters() if x.requires_grad)  # number gradients
    if verbose:
        print('%5s %40s %9s %12s %20s %10s %10s' % ('layer', 'name', 'gradient', 'parameters', 'shape', 'mu', 'sigma'))
        for i, (name, p) in enumerate(model.named_parameters()):
            name = name.replace('module_list.', '')
            print('%5g %40s %9s %12g %20s %10.3g %10.3g' %
                  (i, name, p.requires_grad, p.numel(), list(p.shape), p.mean(), p.std()))

    try:  # FLOPs
        from thop import profile
        stride = max(int(model.stride.max()), 32) if hasattr(model, 'stride') else 32
        img = torch.zeros((1, model.yaml.get('ch', 3), stride, stride), device=next(model.parameters()).device)  # input
        print('Now it is time to show the bug, -------------------> for debug purpose \n')  # 
        flops = profile(deepcopy(model), inputs=(img,), verbose=False)[0] / 1E9 * 2  # stride GFLOPs
        print('Can we print this out correctly?--- if NOT, here it is a problem, -------------------> for debug purpose\n')   
        img_size = img_size if isinstance(img_size, list) else [img_size, img_size]  # expand if int/float
        fs = ', %.1f GFLOPs' % (flops * img_size[0] / stride * img_size[1] / stride)  # 640x640 GFLOPs
    except (ImportError, Exception):
        fs = ''

    LOGGER.info(f"Model Summary: {len(list(model.modules()))} layers, {n_p} parameters, {n_g} gradients{fs}")

As you can see, I added two debug "print"s in model_info. If thop.profile works correctly, both lines should print.

My output log is given below. You can see that only the first debug line is shown, while the second is not, which means thop.profile raised an exception internally (silently caught by the except clause), so the lines after it never executed.

(torch) PS D:\vsAI\yolov5-6.0>  d:; cd 'd:\vsAI\yolov5-6.0'; & 'D:\Anaconda3\envs\torch\python.exe' 'c:\Users\Administrator\.vscode\extensions\ms-python.python-2021.10.1336267007\pythonFiles\lib\python\debugpy\launcher' '58690' '--' 'd:\vsAI\yolov5-6.0\detect.py' 
detect: weights=weights\yolov5n.pt, source=data\images\bus.jpg, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False    
YOLOv5  2021-10-20 torch 1.9.1 CUDA:0 (GeForce GTX 1080 Ti, 11264.0MB)

Fusing layers... 
Now it is time to show the bug, -------------------> for debug purpose 

Model Summary: 213 layers, 1867405 parameters, 0 gradients
attemp_load_done
image 1/1 D:\vsAI\yolov5-6.0\data\images\bus.jpg: debug
640x480 4 persons, 1 bus, 1 skateboard, Done. (0.026s)
Speed: 2.0ms pre-process, 26.0ms inference, 9.0ms NMS per image at shape (1, 3, 640, 640)

It is easy to check, as I did.

I will look further into this problem in the next couple of days if I have time, from training to evaluation. I suppose it is caused by a mismatch in the anchor_grid setup somewhere; it seems thop.profile can tolerate in-place tensor expansion, but not direct tensor replacement. I have no idea why this happens in Python. It seems issue #4833 introduced this problem.

By the way, I used both the yolov5-6.0 model and the 5.0 model from your release archive; they give the same results.

SpaceView commented 3 years ago

I may have found the reason: the error has something to do with PyTorch's tensor expansion mechanism (dimension matching at broadcast time). @fcakyon is right:

@glenn-jocher @SamFC10 the error is raised when a model trained on 5.0 source is used with detect.py from 6.0 source. Compatibility addition seems to be not working for some reason.

I used the latest code and ran a short training; the error disappeared when using my trained weights. If I use a downloaded model (e.g. yolov5n.pt), the error pops up.

jebastin-nadar commented 3 years ago

the error is raised when a model trained on 5.0 source is used

@SpaceView As I've mentioned above, please add links to your trained model if possible, so that the error can be reproduced from my side and debugged.

RaZzzyz commented 3 years ago

I met this problem when trying the simple example at https://docs.ultralytics.com/tutorials/pytorch-hub/. I use the 6.0 yolov5s.pt.

jebastin-nadar commented 3 years ago

@RaZzzyz Cannot reproduce the bug using the simple example mentioned in the link. I'm using Google Colab with the latest branch and model.

[screenshot: the PyTorch Hub example running without error in Colab]

SpaceView commented 3 years ago

the error is raised when a model trained on 5.0 source is used

@SpaceView As I've mentioned above, please add links to your trained model if possible, so that the error can be reproduced from my side and debugged.

@SamFC10 Please read my answer carefully; I have supplied all the info you need. The model is from the Ultralytics releases, e.g.

https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5s.pt
https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5n.pt

To reproduce the issue, please read my second-to-last answer; with an old trained model you cannot get both debug lines to print, even though no exception is raised.

I suppose this issue can be closed. If you train the model using the latest code, there will be no problem.

yamand16 commented 3 years ago

Hi all,

I am getting the same error. All the details that @SpaceView and @SamFC10 mentioned are almost the same for me. I did not train my own model; I'm just trying to run the existing model. After torch.load, the line self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i) throws an error like "RuntimeError: The expanded size of the tensor (1) must match the existing size (80) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 48, 80, 2]".

By the way, I tried both 5.0 and 6.0 pretrained models.

glenn-jocher commented 3 years ago

@yamand16 πŸ‘‹ hi, thanks for letting us know about this possible problem with YOLOv5 πŸš€. We've created a few short guidelines below to help users provide what we need in order to get started investigating a possible problem.

How to create a Minimal, Reproducible Example

(The same minimum reproducible example guidelines as in the earlier comment apply.) If you believe your problem meets all of those criteria, please close this issue and raise a new one using the πŸ› Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! πŸ˜ƒ

github-actions[bot] commented 2 years ago

πŸ‘‹ Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 πŸš€ and Vision AI ⭐!

atremblay-rayhawk commented 2 years ago

I wanted to chime in here that I ran into this issue as well. I waited until we updated to the most recent code, hoping it would be resolved, but unfortunately it was not.

We've had to temporarily patch this call:

if self.onnx_dynamic or self.grid[i].shape[2:4] != x[i].shape[2:4]:
    self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)

to

if self.onnx_dynamic or self.grid[i].shape[2:4] != x[i].shape[2:4]:
    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

and

if self.inplace:
    y[..., 0:2] = (y[..., 0:2] * 2 - 0.5 + self.grid[i]) * self.stride[i]  # xy
    y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
else:  # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953
    xy = (y[..., 0:2] * 2 - 0.5 + self.grid[i]) * self.stride[i]  # xy
    wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh

to

if self.inplace:
    y[..., 0:2] = (y[..., 0:2] * 2 - 0.5 + self.grid[i]) * self.stride[i]  # xy
    y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
else:  # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953
    xy = (y[..., 0:2] * 2 - 0.5 + self.grid[i]) * self.stride[i]  # xy
    wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i].view(1, self.na, 1, 1, 2)

and then revert the _make_grid function back to:

@staticmethod
def _make_grid(nx=20, ny=20):
    yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
    return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()

With these changes everything works as expected; without them, we get the same error that has been listed before.
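
For contrast, the v6.0 _make_grid(nx, ny, i) that the patch above reverts returns both the coordinate grid and the anchor grid, roughly as follows (a paraphrased sketch of models/yolo.py from that release, not a verbatim copy):

def _make_grid(self, nx=20, ny=20, i=0):
    # v6.0 behaviour (paraphrased): build the xy coordinate grid and the per-layer
    # anchor grid together; Detect.forward() then rebinds both self.grid[i] and
    # self.anchor_grid[i], which assumes both attributes are Python lists.
    d = self.anchors[i].device
    yv, xv = torch.meshgrid([torch.arange(ny, device=d), torch.arange(nx, device=d)])
    grid = torch.stack((xv, yv), 2).expand((1, self.na, ny, nx, 2)).float()
    anchor_grid = (self.anchors[i].clone() * self.stride[i]) \
        .view((1, self.na, 1, 1, 2)).expand((1, self.na, ny, nx, 2)).float()
    return grid, anchor_grid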

glenn-jocher commented 2 years ago

@atremblay-rayhawk hi, thank you for your fix suggestion on how to improve YOLOv5 πŸš€!

The fastest and easiest way to incorporate your ideas into the official codebase is to submit a Pull Request (PR) implementing your idea, and if applicable providing before and after profiling/inference/training results to help us understand the improvement your feature provides. This allows us to directly see the changes in the code and to understand how they affect workflows and performance.

Please see our βœ… Contributing Guide to get started.

gg22mm commented 2 years ago

This should be because it is not supported now.

1) This way fails:

model = torch.load('./weights/yolov5s.pt', map_location=device)['model'].float()  # load to FP32

2) This way works:

model = DetectMultiBackend('./weights/yolov5s.pt', device=device, dnn=False)  # this is OK!

(Presumably DetectMultiBackend works because it loads .pt weights via attempt_load(), which applies the anchor_grid compatibility shim.) But I prefer the first one; I don't want such complex encapsulation.

glenn-jocher commented 2 years ago

@gg22mm YOLOv5 models can be loaded any way you want. Your problem is not reproducible:

[screenshot: the model loading successfully]

How to create a Minimal, Reproducible Example

(The same minimum reproducible example guidelines as in the earlier comment apply.) If you believe your problem meets all of those criteria, please close this issue and raise a new one using the πŸ› Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! πŸ˜ƒ

deepxiaobai commented 2 years ago

I am getting the same error.

models\yolo.py, line 59:

self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)

RuntimeError: The expanded size of the tensor (1) must match the existing size (80) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 48, 80, 2]

glenn-jocher commented 2 years ago

@deepxiaobai πŸ‘‹ hi, thanks for letting us know about this possible problem with YOLOv5 πŸš€. We've created a few short guidelines below to help users provide what we need in order to get started investigating a possible problem.

How to create a Minimal, Reproducible Example

(The same minimum reproducible example guidelines as in the earlier comment apply.) If you believe your problem meets all of those criteria, please close this issue and raise a new one using the πŸ› Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! πŸ˜ƒ

ozett commented 2 years ago

I cannot help with code or analysis, but here is a model which gives me the same error in the doods2 environment.

Maybe someone can use it for further testing:

https://github.com/OlafenwaMoses/DeepStack_OpenLogo/releases/download/v1/openlogo.pt

glenn-jocher commented 2 years ago

@ozett πŸ‘‹ hi, thanks for letting us know about this possible problem with YOLOv5 πŸš€. We've created a few short guidelines below to help users provide what we need in order to get started investigating a possible problem.

How to create a Minimal, Reproducible Example

(The same minimum reproducible example guidelines as in the earlier comment apply.) If you believe your problem meets all of those criteria, please close this issue and raise a new one using the πŸ› Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! πŸ˜ƒ

billalkuet07 commented 2 years ago

I trained a model with v5.0, saved it, and am trying to load it with v6.1. I am getting the following error:

File "/workercode/./yolov5/models/common.py", line 439, in forward y = self.model(im, augment=augment, visualize=visualize)[0] File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(*input, *kwargs) File "/workercode/./yolov5/models/yolo.py", line 137, in forward return self._forward_once(x, profile, visualize) # single-scale inference, train File "/workercode/./yolov5/models/yolo.py", line 160, in _forward_once x = m(x) # run File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(input, **kwargs) File "/workercode/./yolov5/models/yolo.py", line 65, in forward self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i) RuntimeError: The expanded size of the tensor (1) must match the existing size (20) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 20, 20, 2]

Is there any suggestion that can help me?

glenn-jocher commented 2 years ago

Train a new model with the latest code.

JAYANTH-MOHAN commented 1 year ago

Yeah, I got the same error, but I corrected it. Here are the steps: 1) make sure you cloned the master branch; 2) take the model weights from the latest YOLOv5 release. Never feed a previous YOLO version's weights (.pt file) to the latest code; that gives the non-singleton dimension 3 error. This is how I corrected my error. All the best!

glenn-jocher commented 12 months ago

@JAYANTH-MOHAN thanks for sharing your solution! This will be helpful for others who encounter similar issues. If you have any other questions or need further assistance, feel free to ask. Good luck with your YOLOv5 project!