ultralytics / yolov3

YOLOv3 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

shape #354

MichaelCong closed this issue 5 years ago

MichaelCong commented 5 years ago

RuntimeError: shape '[128, 64, 3, 3]' is invalid for input of size 44878

MichaelCong commented 5 years ago

python train.py
Namespace(accumulate=8, backend='nccl', batch_size=8, cfg='cfg/yolov3-spp.cfg', data_cfg='data/coco_64img.data', dist_url='tcp://127.0.0.1:9999', epochs=100, evolve=False, giou=False, img_size=416, nosave=False, notest=False, num_workers=4, rank=0, resume=False, single_scale=False, transfer=False, var=0, world_size=1)
Using CUDA device0 _CudaDeviceProperties(name='GeForce GTX 1080 Ti', total_memory=11175MB)
           device1 _CudaDeviceProperties(name='GeForce GTX 1080 Ti', total_memory=11178MB)
           device2 _CudaDeviceProperties(name='GeForce GTX 1080 Ti', total_memory=11178MB)
           device3 _CudaDeviceProperties(name='GeForce GTX 1080 Ti', total_memory=11178MB)

Traceback (most recent call last):
  File "train.py", line 330, in <module>
    accumulate=opt.accumulate,
  File "train.py", line 103, in train
    cutoff = load_darknet_weights(model, weights + 'darknet53.conv.74')
  File "/home/rencong/yolov3/models.py", line 316, in load_darknet_weights
    conv_w = torch.from_numpy(weights[ptr:ptr + num_w]).view_as(conv_layer.weight)
RuntimeError: shape '[128, 64, 3, 3]' is invalid for input of size 44878
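The numbers in the error message can be checked by hand. A weight tensor of shape [128, 64, 3, 3] needs 128 × 64 × 3 × 3 elements, but the slice `weights[ptr:ptr + num_w]` supplied only 44878. Slices do not raise when they run past the end of a buffer; they return whatever is left. One plausible reading, offered here as an assumption rather than a confirmed diagnosis, is that the loaded weights array is shorter than the model expects (for example an incomplete `darknet53.conv.74` download). The sketch below uses only the standard library; `elements_needed` and `slice_is_complete` are hypothetical helper names, not part of the repository.

```python
from math import prod


def elements_needed(shape):
    """Number of scalars a tensor of the given shape holds."""
    return prod(shape)


def slice_is_complete(buffer_len, ptr, shape):
    """True if buffer[ptr:ptr + n] can supply all n elements for `shape`."""
    return ptr + elements_needed(shape) <= buffer_len


# Values taken from the error message in this issue.
shape = (128, 64, 3, 3)
available = 44878

needed = elements_needed(shape)
print(needed)              # 73728
print(needed - available)  # 28850 elements short

# Slicing past the end truncates silently instead of raising,
# so the failure only surfaces later, at the reshape/view step.
data = list(range(10))
print(len(data[8:8 + 5]))  # 2, not 5
```

A guard such as `slice_is_complete(len(weights), ptr, conv_layer.weight.shape)` before each copy would report the shortfall at the point where the data runs out.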

glenn-jocher commented 5 years ago

Hello, thank you for your interest in our work! This is an automated response. Please note that most technical problems are due to:

If none of these apply to you, we suggest you close this issue and raise a new one using the Bug Report template, providing screenshots and minimum viable code to reproduce your issue. Thank you!

sanazss commented 5 years ago

Hi,

Thanks for your help so far. My images are square. Do you think I need to tweak any part of the code to suit this? When I apply the letterbox function to my images, the image doesn't get the 128 values, and the last line of the function doesn't work on my images. Do you have any suggestions? I get poor results for training and testing. Also, I got "NaN" as the loss when I printed my losses. Any hints on this?
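For readers following the letterbox question: letterboxing generally means scaling an image so its longer side matches the target size and then padding the shorter side with a constant gray value. The sketch below computes only the padding amounts, under that assumption. It is not the repository's exact implementation, and `letterbox_padding` is a hypothetical helper. For a square image and a square target the padding comes out as zero on every side, which may be why no gray border values appear in that case.

```python
def letterbox_padding(height, width, target=416):
    """Return (new_h, new_w, top, bottom, left, right) for a square target.

    The image is scaled so its longer side equals `target`; the remaining
    space on the shorter side is split between the two borders.
    """
    ratio = target / max(height, width)
    new_h = round(height * ratio)
    new_w = round(width * ratio)
    pad_h = target - new_h
    pad_w = target - new_w
    top = pad_h // 2
    bottom = pad_h - top
    left = pad_w // 2
    right = pad_w - left
    return new_h, new_w, top, bottom, left, right


# A 480x640 image needs gray bars above and below.
print(letterbox_padding(480, 640))  # (312, 416, 52, 52, 0, 0)

# A square image fills the target exactly, so no padding is added.
print(letterbox_padding(500, 500))  # (416, 416, 0, 0, 0, 0)
```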

glenn-jocher commented 5 years ago

@sanazss I see you are posting many issues and trying many things, but I believe you are misdirecting your efforts. My suggestions are very simple:

  1. Convert your training data to the correct format.
  2. git pull the latest repo. Do not modify anything.
  3. Train to 100 epochs.
  4. Plot your training results using from utils import utils; utils.plot_results(). Upload your train_batch0.jpg, test_batch0.jpg, and results.png images here.

Without these three images I can't provide you with any suggestions.
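As an illustration of step 1, here is a minimal sketch of a label check. It assumes Darknet-style label files: one row per object in the form `class x_center y_center width height`, with the four box values normalized to the range 0 to 1. `parse_label_line` is a hypothetical helper, not a function from the repository.

```python
def parse_label_line(line):
    """Parse one 'class x_center y_center width height' row.

    Raises ValueError if the row does not match the assumed format.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    cls = int(parts[0])
    box = [float(v) for v in parts[1:]]
    if cls < 0:
        raise ValueError(f"negative class index: {line!r}")
    # NaN fails this comparison too, so it is rejected here.
    if not all(0.0 <= v <= 1.0 for v in box):
        raise ValueError(f"box values must be normalized to [0, 1]: {line!r}")
    if box[2] <= 0.0 or box[3] <= 0.0:
        raise ValueError(f"width and height must be positive: {line!r}")
    return cls, box


print(parse_label_line("0 0.5 0.5 0.25 0.25"))  # (0, [0.5, 0.5, 0.25, 0.25])

# Pixel coordinates instead of normalized values are rejected.
try:
    parse_label_line("0 208 208 104 104")
except ValueError as err:
    print("invalid:", err)
```

Running a check like this over every label file before training is a cheap way to rule out formatting problems.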