Closed starsky68 closed 4 years ago
Hello @starsky68, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments.
If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.
If this is a custom model or data training question, please note Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:
For more information please visit https://www.ultralytics.com.
@starsky68 I tried to reproduce this. Was not able to produce your error message, but I did see a leaf variable error message regarding lcls. I've just pushed a fix for this, and single class training now operates correctly. Please git pull and try again.
thanks for your help
Dear @glenn-jocher,
I also have a similar problem with single-class training (with no preset weights):
Scanning labels ../data/labels.cache (284 found, 0 missing,
Scanning labels ../data/labels.cache (284 found, 0 missing,
Analyzing anchors... anchors/target = 0.85, Best Possible Recall (BPR) = 0.8521. Attempting to generate improved anchors, please wait...
Running kmeans for 9 anchors on 284 points...
thr=0.25: 1.0000 best possible recall, 9.00 anchors past thr
n=9, img_size=640, metric_all=0.735/0.994-mean/best, past_thr=0.735-mean: 390,14, 390,23, 391,24, 390,28, 390,31, 390,40
Evolving anchors with Genetic Algorithm: fitness = 0.9950: 100%|█| 1000/1000 [00
thr=0.25: 1.0000 best possible recall, 9.00 anchors past thr
n=9, img_size=640, metric_all=0.735/0.995-mean/best, past_thr=0.735-mean: 390,14, 390,23, 391,24, 391,28, 390,31, 390,40
Traceback (most recent call last):
File "train.py", line 448, in <module>
train(hyp, opt, device, tb_writer)
File "train.py", line 192, in train
check_anchors(dataset, model=model, thr=hyp['anchor_t'], imgsz=imgsz)
File "/home/laszlo/dev/yolov5/utils/general.py", line 101, in check_anchors
m.anchor_grid[:] = new_anchors.clone().view_as(m.anchor_grid) # for inference
RuntimeError: shape '[3, 1, 3, 1, 1, 2]' is invalid for input of size 12
... where the last number 12 is sometimes 14 or 16, depending on number of images. I'll try the workaround of defining a dummy class, to have a multiclass problem.
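The sizes in this error can be checked by hand. The sketch below is plain arithmetic on the numbers in the traceback, not code from the repository; the helper name is made up for illustration. It shows that the target shape needs 9 anchors (18 values), while sizes 12, 14 and 16 correspond to only 6, 7 and 8 anchors.

```python
from math import prod

# Shape requested by view_as(m.anchor_grid) in the traceback above.
TARGET_SHAPE = (3, 1, 3, 1, 1, 2)


def anchors_in_tensor(numel, values_per_anchor=2):
    """Hypothetical helper: number of (w, h) anchors held in a flat tensor."""
    return numel // values_per_anchor


required_values = prod(TARGET_SHAPE)                    # values the view needs
required_anchors = anchors_in_tensor(required_values)   # anchors the view needs

for numel in (12, 14, 16):
    got = anchors_in_tensor(numel)
    print(f"size {numel}: {got} anchors, {required_anchors} required")
```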
@treszkai that's pretty funny, you have some pretty uniform objects in your dataset. Unfortunately we can only act on reproducible errors, so if you can write up a short google colab notebook that we can run to reproduce this, we can get started debugging it. Otherwise there's nothing we can do. I'll paste you some additional debugging information below:
Please note that most technical problems are due to:
Your changes to the default repository. If your issue is not reproducible in a new git clone version of this repository we can not debug it. Before going further run this code and ensure your issue persists:
sudo rm -rf yolov5 # remove existing
git clone https://github.com/ultralytics/yolov5 && cd yolov5 # clone latest
python detect.py # verify detection
# CODE TO REPRODUCE YOUR ISSUE HERE
Your custom data. If your issue is not reproducible with COCO or COCO128 data we can not debug it. Visit our Custom Training Tutorial for guidelines on training your custom data. Examine train_batch0.jpg and test_batch0.jpg for a sanity check of training and testing data.
Your environment. If your issue is not reproducible in one of the verified environments below we can not debug it. If you are running YOLOv5 locally, ensure your environment meets all of the requirements.txt dependencies specified below.
If none of these apply to you, we suggest you close this issue and raise a new one using the Bug Report template, providing screenshots and minimum viable code to reproduce your issue. Thank you!
Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.6. To install run:
$ pip install -r requirements.txt
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are passing. These tests evaluate proper operation of basic YOLOv5 functionality, including training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu.
Thanks for the pointer, the data is indeed very uniform. I'll try again when I have some more variety.
Another easy option: the shapes of your objects are clearly visible in your screenshot, so you can just put those into your model yaml by hand.
You can also skip autoanchor entirely with python train.py --noautoanchor, but with the default anchors giving you a BPR of 0.85, your mAP will never exceed 0.85.
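For readers unfamiliar with BPR, here is a minimal sketch of a ratio-based best possible recall in plain Python. It is an assumption about the general idea (a label counts as recallable if at least one anchor is within a width/height ratio threshold of it), not a copy of the check in utils/general.py, and the function name and sample values are illustrative only.

```python
def best_possible_recall(label_wh, anchor_wh, thr=0.25):
    """Fraction of labels that at least one anchor fits within the ratio threshold.

    label_wh and anchor_wh are lists of (width, height) pairs in pixels.
    thr=0.25 means an anchor may be at most 4x larger or smaller per side.
    """
    matched = 0
    for w, h in label_wh:
        best = 0.0
        for aw, ah in anchor_wh:
            rw, rh = w / aw, h / ah
            # Worst-side fit in (0, 1]; 1.0 means the anchor equals the label.
            fit = min(rw, 1 / rw, rh, 1 / rh)
            best = max(best, fit)
        if best > thr:
            matched += 1
    return matched / len(label_wh)


# A square anchor fits the square label but not the long, thin one.
labels = [(100, 100), (390, 14)]
anchors = [(100, 100)]
print(best_possible_recall(labels, anchors))
```

Labels that no anchor fits cannot be assigned to any anchor during training, which is why a low BPR caps the achievable recall.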
Thanks for the quick and detailed response.
you have some pretty uniform objects in your dataset.
And wow, you have good eyes for this!
@treszkai I just spotted the problem. It looks like the scipy kmeans function we use for an initial evolution starting point will return fewer points than requested when the data is very similar. So you asked it for 9, and it returned 6, for example. That's just the immediate cause of your bug, though. A much deeper issue is how to handle anchors correctly for varying or non-varying receptive fields.
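One way to avoid the crash described above is to check the anchor count before reshaping. The following is only a sketch of such a guard under that assumption; the function name is hypothetical and this is not the fix that was applied in the repository.

```python
def anchors_or_fallback(new_anchors, current_anchors):
    """Hypothetical guard: keep the current anchors if clustering came up short.

    Both arguments are lists of (width, height) pairs. The new anchors are
    only accepted when their count matches the count the model expects.
    """
    if len(new_anchors) != len(current_anchors):
        print(
            f"WARNING: requested {len(current_anchors)} anchors, "
            f"got {len(new_anchors)}; keeping the current anchors"
        )
        return current_anchors
    return new_anchors


defaults = [(10, 13)] * 9   # placeholder values, 9 anchors expected
clustered = [(390, 14)] * 6  # clustering returned only 6
kept = anchors_or_fallback(clustered, defaults)
```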
Like I said, your immediate solution is probably just to turn autoanchor off and plug those values into your model.yaml file. I would, though, set the small anchors (P3) to all zeros, since they have low receptive field overlap with your objects.
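As an illustration of that suggestion, the snippet below prints an anchors section using the six evolved values from the log, with P3 zeroed. How the six values are split between P4 and P5 is an assumption made here for the example, and the helper name is hypothetical; check the result against your own model yaml before using it.

```python
def anchors_yaml(levels):
    """Format (name, [(w, h), ...]) rows as a YOLOv5-style anchors section."""
    lines = ["anchors:"]
    for name, pairs in levels:
        flat = ", ".join(f"{w},{h}" for w, h in pairs)
        lines.append(f"  - [{flat}]  # {name}")
    return "\n".join(lines)


levels = [
    ("P3/8", [(0, 0), (0, 0), (0, 0)]),            # zeroed, as suggested above
    ("P4/16", [(390, 14), (390, 23), (391, 24)]),  # assumed split
    ("P5/32", [(391, 28), (390, 31), (390, 40)]),  # assumed split
]
text = anchors_yaml(levels)
print(text)
```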
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
❔Question
For single-class data, this bug will appear
Additional context
File "D:\Python\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "D:\learning\pythonWorkspace\PycharmProjects\yolov5-master\models\yolo.py", line 26, in forward
x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
RuntimeError: shape '[1, 3, 6, 48, 80]' is invalid for input of size 983040
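The mismatch in this error can be quantified with a few lines of arithmetic. This sketch uses only the numbers in the message and a made-up helper name. It shows how many values per grid cell the incoming tensor actually carries compared with the na * no the view expects, and it does not attempt to say why they differ.

```python
def values_per_cell(numel, bs, ny, nx):
    """Hypothetical helper: values stored for each grid cell of each image."""
    return numel // (bs * ny * nx)


bs, na, no, ny, nx = 1, 3, 6, 48, 80   # from shape '[1, 3, 6, 48, 80]'
numel = 983040                          # from 'input of size 983040'

expected = na * no                      # values per cell the view needs
actual = values_per_cell(numel, bs, ny, nx)
print(f"expected {expected} values per cell, tensor has {actual}")
```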