ultralytics / yolov3

YOLOv3 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0
10.22k stars 3.45k forks source link

RuntimeError #948

Closed msadoun closed 4 years ago

msadoun commented 4 years ago

Ran train.py and it shows: [python train.py --data data/coco.data --cfg cfg/yolov3.cfg]

`C:\Users\msadoun\Desktop\YOLOv3---2>python train.py --data data/coco.data --cfg cfg/yolov3-spp.cfg Namespace(accumulate=4, adam=False, batch_size=16, bucket='', cache_images=False, cfg='cfg/yolov3-spp.cfg', data='data/coco.data', device='', epochs=300, evolve=False, img_size=[416], multi_scale=False, name='', nosave=False, notest=False, rect=False, resume=False, single_cls=False, var=None, weights='weights/yolov3-spp-ultralytics.pt') Using CPU

2020-03-20 13:58:52.696971: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_100.dll'; dlerror: cudart64_100.dll not found 2020-03-20 13:58:52.700320: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine. Model Summary: 225 layers, 6.27927e+07 parameters, 6.27927e+07 gradients Caching labels (13 found, 0 missing, 0 empty, 0 duplicate, for 13 images): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 224.26it/s] Caching labels (13 found, 0 missing, 0 empty, 0 duplicate, for 13 images): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 220.46it/s] Using 0 dataloader workers Starting training for 300 epochs...

 Epoch   gpu_mem      GIoU       obj       cls     total   targets  img_size
 0/299        0G      7.26       107      15.7       130  9.42e+03       416: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [01:13<00:00,  5.67s/it]
           Class    Images   Targets         P         R   mAP@0.5        F1:   0%|                                                                                                                               | 0/7 [00:09<?, ?it/s]

Traceback (most recent call last): File "train.py", line 433, in train() # train normally File "train.py", line 321, in train dataloader=testloader) File "C:\Users\msadoun\Desktop\YOLOv3---2\test.py", line 99, in test inf_out, train_out = model(imgs) # inference and training outputs File "C:\Users\msadoun\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 532, in call result = self.forward(*input, **kwargs) File "C:\Users\msadoun\Desktop\YOLOv3---2\models.py", line 310, in forward return torch.cat(io, 1), p RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 18 and 85 in dimension 2 at C:\w\1\s\windows\pytorch\aten\src\TH/generic/THTensor.cpp:612`

the input size is 5472x3648 (all images have the same size)

is there a solution for this problem?

github-actions[bot] commented 4 years ago

Hello @msadoun, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Google Colab Notebook, Docker Image, and GCP Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

glenn-jocher commented 4 years ago

@msadoun your image size does not matter as all images are resized for training. It looks like you modified your cfg incorrectly. See https://docs.ultralytics.com/yolov5/tutorials/train_custom_data for directions.

msadoun commented 4 years ago

I have 13 classes and I changes the numberof filters to 54 (1+4+13)*3 and class count to 13 but I still get the problem

Capture

glenn-jocher commented 4 years ago

@msadoun start from coco16.data tutorial, which works perfectly well, and adapt your dataset from there.

jeevottamlokurti commented 4 years ago

I got same error followed all steps. instead of "RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 18 and 85 in dimension 2" .I got "Sizes of tensors must match except in dimension 1. Got 48 and 85 in dimension 2 at C" .Please Help!!

glenn-jocher commented 4 years ago

@jeevottamlokurti this error happens when you misconfigure a cfg or try to use an incompatible weights cfg combination.

Just start from the tutorials https://docs.ultralytics.com/yolov5/tutorials/train_custom_data

jeevottamlokurti commented 4 years ago

Ya .. Thank you...I actually missed changing filters in all [convolution] which are above [yolo] in cfg file thanks you once again

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 30 days with no activity. Remove Stale label or comment or this will be closed in 5 days.

glenn-jocher commented 12 months ago

@jeevottamlokurti you're welcome! Glad to hear that you found the issue. If you have any more questions or need further assistance, feel free to ask.