ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

RuntimeError: shape '[3, -1, 2]' is invalid for input of size 3 #11731

Closed kalikhademi closed 1 year ago

kalikhademi commented 1 year ago

Search before asking

YOLOv5 Component

Training

Bug

I followed this tutorial to train YOLOv3 on a custom dataset, using pretrained weights and the repo's yolov3.yaml as the cfg file. When I run train.py on one GPU, I receive the following error:

from n params module arguments
0 -1 1 928 models.common.Conv [3, 32, 3, 1]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 20672 models.common.Bottleneck [64, 64]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 2 164608 models.common.Bottleneck [128, 128]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 8 2627584 models.common.Bottleneck [256, 256]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 8 10498048 models.common.Bottleneck [512, 512]
9 -1 1 4720640 models.common.Conv [512, 1024, 3, 2]
10 -1 4 20983808 models.common.Bottleneck [1024, 1024]
11 -1 1 5245952 models.common.Bottleneck [1024, 1024, False]
12 -1 1 525312 models.common.Conv [1024, 512, 1, 1]
13 -1 1 4720640 models.common.Conv [512, 1024, 3, 1]
14 -1 1 525312 models.common.Conv [1024, 512, 1, 1]
15 -1 1 4720640 models.common.Conv [512, 1024, 3, 1]
16 -2 1 131584 models.common.Conv [512, 256, 1, 1]
17 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
18 [-1, 8] 1 0 models.common.Concat [1]
19 -1 1 1377792 models.common.Bottleneck [768, 512, False]
20 -1 1 1312256 models.common.Bottleneck [512, 512, False]
21 -1 1 131584 models.common.Conv [512, 256, 1, 1]
22 -1 1 1180672 models.common.Conv [256, 512, 3, 1]
23 -2 1 33024 models.common.Conv [256, 128, 1, 1]
24 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
25 [-1, 6] 1 0 models.common.Concat [1]
26 -1 1 344832 models.common.Bottleneck [384, 256, False]
27 -1 2 656896 models.common.Bottleneck [256, 256, False]
/Object Detection/yolov3/data/miap.yaml
Namespace(weights='yolov3_byol.pt', cfg='yolov3.yaml', data='/Object Detection/yolov3/data/miap.yaml', hyp='data/hyps/hyp.scratch-low.yaml', epochs=100, batch_size=16, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket='', cache=None, image_weights=False, device='', multi_scale=False, single_cls=False, optimizer='SGD', sync_bn=False, workers=8, project='runs/train', name='exp', exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias='latest', save_dir='runs/train/exp18')
Traceback (most recent call last):
  File "/Object Detection/yolov3/train.py", line 642, in <module>
    main(opt)
  File "/Object Detection/yolov3/train.py", line 535, in main
    train(opt.hyp, opt, device, callbacks)
  File "/Object Detection/yolov3/train.py", line 127, in train
    model = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device)  # create
  File "/Object Detection/yolov3/models/yolo.py", line 187, in __init__
    self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist
  File "/Object Detection/yolov3/models/yolo.py", line 354, in parse_model
    m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
  File "/Object Detection/yolov3/models/yolo.py", line 54, in __init__
    self.register_buffer('anchors', torch.tensor(anchors).float().view(self.nl, -1, 2))  # shape(nl,na,2)
RuntimeError: shape '[3, -1, 2]' is invalid for input of size 3

Environment

YOLOv3, OS: Ubuntu, single A100 GPU, Python 3.10, torch 2.0.1, CUDA 11.7

Minimal Reproducible Example

No response

Additional

python train.py --data coco128.yaml --weights yolov3.pt --cfg yolov3.yaml --img 640

Are you willing to submit a PR?

github-actions[bot] commented 1 year ago

👋 Hello @kalikhademi, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics

glenn-jocher commented 1 year ago

@kalikhademi hi,

Thank you for reaching out. I understand that you are facing an issue when running train.py on one GPU for YOLOv3. After reviewing your error message, it seems that the shape '[3, -1, 2]' is invalid for an input of size 3.
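For readers hitting the same message, here is a minimal sketch of the arithmetic behind it. A reshape to (3, -1, 2) needs a total element count that is a multiple of 3 * 2 = 6, and the anchors tensor in the traceback holds only 3 values. The helper names `flatten` and `anchors_per_layer` are hypothetical and not part of the repo, and the sketch uses plain Python lists rather than torch tensors. It checks the counts only and does not say why the anchors ended up with 3 values in this run.

```python
def flatten(values):
    """Recursively flatten nested lists/tuples of numbers into one flat list."""
    if isinstance(values, (list, tuple)):
        flat = []
        for item in values:
            flat.extend(flatten(item))
        return flat
    return [values]


def anchors_per_layer(anchors, nl):
    """Return na such that the flattened anchors fit the shape (nl, na, 2).

    Hypothetical helper: raises ValueError when the element count is not a
    positive multiple of nl * 2, since a non-multiple cannot be reshaped
    to (nl, -1, 2).
    """
    numel = len(flatten(anchors))
    if nl <= 0 or numel == 0 or numel % (nl * 2) != 0:
        raise ValueError(f"cannot reshape {numel} value(s) into ({nl}, -1, 2)")
    return numel // (nl * 2)


# Three detection layers x three anchors x (width, height) = 18 values.
good = [
    [10, 13, 16, 30, 33, 23],
    [30, 61, 62, 45, 59, 119],
    [116, 90, 156, 198, 373, 326],
]
print(anchors_per_layer(good, nl=3))  # 18 // 6 -> 3

# Only 3 values in total, as in the error message: 3 is not a multiple of 6.
try:
    anchors_per_layer([3, 3, 3], nl=3)
except ValueError as err:
    print(err)
```

Running a check like this on the `anchors` entry of the cfg and hyp files before building the model would show whether the anchor list has the expected number of values.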

To better understand and resolve this issue, could you please provide some additional information? Specifically, please provide the steps you followed and any modifications you made to the code or configuration files. Additionally, please confirm if you are running the latest versions of YOLOv5, Python, Torch, and CUDA.

Once I have this information, I will be able to assist you further in resolving this issue.

Thank you.

kalikhademi commented 1 year ago

Hi Glenn,

  1. I have cloned the repo of yolov3.
  2. Followed the steps at this link (https://docs.ultralytics.com/yolov5/tutorials/train_custom_data/)
  3. Command used: "python train.py --data miap.yaml --weights yolov3_byol.pt --cfg yolov3.yaml --img 640"
  4. I did not make any changes in the code
  5. Python 3.10, torch 2.0.1, cuda 11.7, virtual environment with packages in requirements.txt of the repo

I pretrained BYOL with the YOLOv3 backbone (the first 10 layers) and then used that checkpoint to fine-tune my model, so that the first 10 layers start from the pretrained weights.

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

glenn-jocher commented 11 months ago

@kalikhademi Thanks for the detailed information. It seems like the issue may be related to the use of the custom pre-trained weights and the finetuning process.

To better assist you, I recommend ensuring the compatibility of the pre-trained BYOL weights with the YOLOv3 model architecture, and that any modifications made during the finetuning process are consistent with the YOLOv3 configuration.
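One way to run that compatibility check is to compare parameter names and shapes between the checkpoint and the model before training. The sketch below works on plain dicts that map a parameter name to a shape tuple. The function name `compatible_entries` and all names and shapes in the example are illustrative assumptions, not values from the actual BYOL checkpoint. With PyTorch, such dicts could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def compatible_entries(pretrained_shapes, model_shapes):
    """Split pretrained entries into those matching the model and those that do not.

    Both arguments map a parameter name to its shape tuple (hypothetical inputs).
    An entry matches only if the name exists in the model with the same shape.
    """
    matched, mismatched = {}, {}
    for name, shape in pretrained_shapes.items():
        if model_shapes.get(name) == shape:
            matched[name] = shape
        else:
            # Record both shapes; the model shape is None if the name is absent.
            mismatched[name] = (shape, model_shapes.get(name))
    return matched, mismatched


# Toy shapes for illustration only; not taken from the real checkpoint.
pretrained = {
    "model.0.conv.weight": (32, 3, 3, 3),
    "model.1.conv.weight": (64, 32, 3, 3),
    "encoder.proj.weight": (256, 1024),  # e.g. a leftover projection head
}
model = {
    "model.0.conv.weight": (32, 3, 3, 3),
    "model.1.conv.weight": (64, 16, 3, 3),  # same name, different shape
}

matched, mismatched = compatible_entries(pretrained, model)
print(sorted(matched))
for name, (have, want) in sorted(mismatched.items()):
    print(f"{name}: checkpoint {have}, model {want}")
```

Entries that land in `mismatched` are the ones to inspect first: they are either missing from the model or stored with a different shape.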

Additionally, I noticed that you are using Python 3.10, which may not be fully compatible with all dependencies. YOLOv5 currently supports up to Python 3.9, so I recommend using Python 3.9 for compatibility.

If the issue persists, you might consider adjusting the BYOL pretraining or the finetuning process to align it with the YOLOv3 model requirements.

I hope this helps, and please let me know if you have any further questions.