MichaelCong closed this issue 5 years ago
python train.py
Namespace(accumulate=8, backend='nccl', batch_size=8, cfg='cfg/yolov3-spp.cfg', data_cfg='data/coco_64img.data', dist_url='tcp://127.0.0.1:9999', epochs=100, evolve=False, giou=False, img_size=416, nosave=False, notest=False, num_workers=4, rank=0, resume=False, single_scale=False, transfer=False, var=0, world_size=1)
Using CUDA device0 _CudaDeviceProperties(name='GeForce GTX 1080 Ti', total_memory=11175MB)
           device1 _CudaDeviceProperties(name='GeForce GTX 1080 Ti', total_memory=11178MB)
           device2 _CudaDeviceProperties(name='GeForce GTX 1080 Ti', total_memory=11178MB)
           device3 _CudaDeviceProperties(name='GeForce GTX 1080 Ti', total_memory=11178MB)
Traceback (most recent call last):
  File "train.py", line 330, in <module>
Hello, thank you for your interest in our work! This is an automated response. Please note that most technical problems are due to:
Changes to the default repository. If your issue is not reproducible in a fresh git clone version of this repository we can not debug it. Before going further run this code and ensure your issue persists:
sudo rm -rf yolov3 # remove existing repo
git clone https://github.com/ultralytics/yolov3 && cd yolov3 # git clone latest
python3 detect.py # verify detection
python3 train.py # verify training (a few batches only)
# CODE TO REPRODUCE YOUR ISSUE HERE
Examine train_batch0.jpg and test_batch0.jpg for a sanity check of training and testing data.
If none of these apply to you, we suggest you close this issue and raise a new one using the Bug Report template, providing screenshots and minimum viable code to reproduce your issue. Thank you!
Hi,
Thanks for your help so far. My images are square; do you think I need to tweak any part of the code to suit this? When I apply the letterbox function to my images, the image doesn't get the 128 values, and the last line of the function doesn't work on my images. Do you have any suggestions? I get poor results for training and testing. Also, I got "NaN" as the loss when I printed my losses. Any hints on this?
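On the square-image question, it may help to look at the padding arithmetic in isolation. The sketch below is not the repository's letterbox function; letterbox_padding and its parameters are hypothetical names, and it assumes the "128 values" refer to the grey border pixels a letterbox step adds. Under that assumption, a square image scaled to a square target (416 here, matching img_size in the log above) needs zero padding, so no grey pixels would appear at all.

```python
def letterbox_padding(height, width, new_size=416):
    """Hypothetical helper: compute the resized shape and the border padding
    needed to fit an image into a new_size x new_size square while keeping
    its aspect ratio. Returns ((new_h, new_w), (top, bottom, left, right))."""
    # Scale so that the longer side becomes exactly new_size.
    ratio = new_size / max(height, width)
    new_h = round(height * ratio)
    new_w = round(width * ratio)

    # Whatever is left over on the shorter side is filled with border pixels.
    pad_h = new_size - new_h
    pad_w = new_size - new_w
    top, left = pad_h // 2, pad_w // 2
    bottom, right = pad_h - top, pad_w - left
    return (new_h, new_w), (top, bottom, left, right)


print(letterbox_padding(500, 500))  # ((416, 416), (0, 0, 0, 0)) -> no border on square input
print(letterbox_padding(480, 640))  # ((312, 416), (52, 52, 0, 0)) -> border above and below
```

If the padding really is zero for your images, the absence of 128-valued pixels would be expected and is probably not the cause of the poor results or the NaN loss.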
@sanazss I see you are posting many issues and trying many things, but I believe you are misdirecting your efforts. My suggestions are very simple:
git pull the latest repo. Do not modify anything.
Plot your training results with:
from utils import utils; utils.plot_results()
Upload your train_batch0.jpg, test_batch0.jpg, and results.png images here. Without these 3 images I can't provide you any suggestions.
RuntimeError: shape '[128, 64, 3, 3]' is invalid for input of size 44878
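The error message can be read arithmetically: a tensor of shape [128, 64, 3, 3] needs 128 × 64 × 3 × 3 = 73728 elements, but only 44878 were supplied. The sketch below just reproduces that check; missing_elements is a hypothetical helper, not part of the repository. One possible cause, which is an assumption and not confirmed anywhere in this thread, is that the data being reshaped (for example a weights file) is shorter than the layer sizes in the cfg expect.

```python
from math import prod


def missing_elements(shape, available):
    """Hypothetical helper: how many elements short `available` is of
    filling a tensor of the given shape (negative if there are too many)."""
    required = prod(shape)
    return required - available


shape = [128, 64, 3, 3]   # shape from the RuntimeError above
available = 44878         # input size from the RuntimeError above

print(prod(shape))                         # 73728 elements required
print(missing_elements(shape, available))  # 28850 elements missing
```

A reshape can only succeed when the element counts match exactly, so any non-zero result here means the input and the expected layer shape disagree.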