Sizes of tensors must match except in dimension 1. Expected size 26 but got size 25 for tensor number 1 in the list.

Michelvl92 commented 2 years ago

Search before asking

[X] I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

When loading any model with torch.hub (both the default models and custom ones), I always get the same error when calling create_graph_from_pytorch_model from the ReceptiveFieldAnalysisToolbox.

Bug

Using cache found in /root/.cache/torch/hub/ultralytics_yolov5_master
YOLOv5 🚀 2022-1-27 torch 1.10.1+cu113 CUDA:0 (NVIDIA GeForce RTX 3090, 24266MiB)

Fusing layers... 
Model Summary: 213 layers, 7225885 parameters, 0 gradients
Adding AutoShape... 
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_2611/173803600.py in <module>
      1 model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
----> 2 graph = create_graph_from_pytorch_model(model)

/opt/conda/lib/python3.8/site-packages/rfa_toolbox/encodings/pytorch/ingest_architecture.py in create_graph_from_model(model, input_res)
    288         The EnrichedNetworkNodeGraph
    289     """
--> 290     tm = torch.jit.trace(model, (torch.randn(*input_res),))
    291     return make_graph(tm, ref_mod=model).to_graph()

/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py in trace(func, example_inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
    739 
    740     if isinstance(func, torch.nn.Module):
--> 741         return trace_module(
    742             func,
    743             {"forward": example_inputs},

/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py in trace_module(mod, inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
    956             example_inputs = make_tuple(example_inputs)
    957 
--> 958             module._c._create_method_from_trace(
    959                 method_name,
    960                 func,

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
   1088                 recording_scopes = False
   1089         try:
-> 1090             result = self.forward(*input, **kwargs)
   1091         finally:
   1092             if recording_scopes:

/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     26         def decorate_context(*args, **kwargs):
     27             with self.__class__():
---> 28                 return func(*args, **kwargs)
     29         return cast(F, decorate_context)
     30 

~/.cache/torch/hub/ultralytics_yolov5_master/models/common.py in forward(self, imgs, size, augment, profile)
    508         if isinstance(imgs, torch.Tensor):  # torch
    509             with amp.autocast(enabled=autocast):
--> 510                 return self.model(imgs.to(p.device).type_as(p), augment, profile)  # inference
    511 
    512         # Pre-process

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
   1088                 recording_scopes = False
   1089         try:
-> 1090             result = self.forward(*input, **kwargs)
   1091         finally:
   1092             if recording_scopes:

~/.cache/torch/hub/ultralytics_yolov5_master/models/common.py in forward(self, im, augment, visualize, val)
    397         b, ch, h, w = im.shape  # batch, channel, height, width
    398         if self.pt or self.jit:  # PyTorch
--> 399             y = self.model(im) if self.jit else self.model(im, augment=augment, visualize=visualize)
    400             return y if val else y[0]
    401         elif self.dnn:  # ONNX OpenCV DNN

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
   1088                 recording_scopes = False
   1089         try:
-> 1090             result = self.forward(*input, **kwargs)
   1091         finally:
   1092             if recording_scopes:

~/.cache/torch/hub/ultralytics_yolov5_master/models/yolo.py in forward(self, x, augment, profile, visualize)
    124         if augment:
    125             return self._forward_augment(x)  # augmented inference, None
--> 126         return self._forward_once(x, profile, visualize)  # single-scale inference, train
    127 
    128     def _forward_augment(self, x):

~/.cache/torch/hub/ultralytics_yolov5_master/models/yolo.py in _forward_once(self, x, profile, visualize)
    147             if profile:
    148                 self._profile_one_layer(m, x, dt)
--> 149             x = m(x)  # run
    150             y.append(x if m.i in self.save else None)  # save output
    151             if visualize:

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
   1088                 recording_scopes = False
   1089         try:
-> 1090             result = self.forward(*input, **kwargs)
   1091         finally:
   1092             if recording_scopes:

~/.cache/torch/hub/ultralytics_yolov5_master/models/common.py in forward(self, x)
    273 
    274     def forward(self, x):
--> 275         return torch.cat(x, self.d)
    276 
    277 

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 26 but got size 25 for tensor number 1 in the list.

Environment

Default Yolov5 latest docker

Minimal Reproducible Example

! pip install rfa_toolbox
import torch
from rfa_toolbox import create_graph_from_pytorch_model, visualize_architecture

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
graph = create_graph_from_pytorch_model(model)

Additional

No response

Are you willing to submit a PR?

[X] Yes I'd like to help by submitting a PR!

MLRichter commented 2 years ago

This is caused by some control flow within the forward pass of YOLOv5. RFA-Toolbox uses the JIT compiler to extract the graph of a model, and the tracer only evaluates the parts of the control flow that the forward pass touches during the trace. For some reason, this requires any YOLOv5 model to be in train mode in order to be traceable by PyTorch's JIT compiler.
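To illustrate the tracing behaviour, here is a minimal sketch. ToyHead is a made-up module for illustration, not YOLOv5's actual detection head; the only point is that torch.jit.trace records whichever branch runs while tracing and bakes it into the graph.

```python
import torch


class ToyHead(torch.nn.Module):
    """Hypothetical module whose forward pass branches on train/eval mode."""

    def forward(self, x):
        if self.training:
            return x + 1        # branch taken in train mode
        return x.sigmoid()      # branch taken in eval mode


model = ToyHead().train()
# The tracer runs forward() once and records only the ops it actually executed.
traced = torch.jit.trace(model, torch.zeros(2))

model.eval()                    # switch the eager module to the other branch
x = torch.tensor([0.0, 1.0])
eager_out = model(x)            # eval branch: sigmoid
traced_out = traced(x)          # still the train branch recorded during tracing
```

So the mode the model is in at trace time decides which graph you end up with.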

If you put the model in train mode, it works. Here is some example code that fixed the issue:


import torch
from rfa_toolbox import create_graph_from_pytorch_model, visualize_architecture

# Model
model_name = "YoloV5s"
model = torch.hub.load('ultralytics/yolov5', model_name.lower())  # or yolov5m, yolov5l, yolov5x, custom
# model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.onnx')  # alternative: load a custom model
model.train()  # train mode, so the JIT tracer can follow the forward pass
# Trace on the CPU with an explicit (batch, channels, height, width) input shape
graph = create_graph_from_pytorch_model(model.cpu(), (4, 3, 640, 640))
visualize_architecture(graph, model_name=model_name, input_res=10000).render(model_name)
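As a side note on the numbers in the error message: a 26-versus-25 mismatch at a concat is also the pattern you would see when the input height or width is not a multiple of 32. The thread does not confirm which input size was used for the failing trace, so treat this as a rough sketch under assumptions: the layer list (one 6x6 stride-2 conv followed by four 3x3 stride-2 convs, then a 2x upsample before the concat) and the 399-pixel input are illustrative guesses, not values taken from the traceback.

```python
def conv_out(size, kernel, stride, padding):
    """Standard output-size formula for a strided convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pyramid_sizes(size):
    """Spatial sizes at strides 2, 4, 8, 16, 32 for the assumed conv stack."""
    # Assumed (kernel, stride, padding) per downsampling conv; hypothetical values.
    layers = [(6, 2, 2)] + [(3, 2, 1)] * 4
    sizes = []
    for kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


def concat_sizes(size):
    """Return (2x-upsampled stride-32 size, stride-16 size) seen by the concat."""
    s = pyramid_sizes(size)
    return 2 * s[4], s[3]


print(concat_sizes(399))  # (26, 25): sizes differ, so torch.cat would fail
print(concat_sizes(640))  # (40, 40): sizes agree
```

The snippet above passes a 640x640 input, which is a multiple of 32, so the two branches line up there.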

glenn-jocher commented 2 years ago

@Michelvl92 errors generated by 3rd party tools should be reported directly to their authors, they are outside of the scope of our work and support.

pranavraja99 commented 2 years ago

@Michelvl92 errors generated by 3rd party tools should be reported directly to their authors, they are outside of the scope of our work and support.

ya but if you're going to use 3rd party tools in your work use them properly. I've been spending an hour over this ridiculous issue

glenn-jocher commented 2 years ago

@pranavraja99 OP's tools are not related to YOLOv5, they are not used in this repository
