ultralytics / yolov3

YOLOv3 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

KeyError: 'module_list.85.Conv2d.weight' #650

Closed alontrais closed 4 years ago

alontrais commented 4 years ago

Hey, I get a new error when I run the train script:

Downloading https://drive.google.com/uc?export=download&id=158g62Vs14E3aj7oPVPuEnNZMKFNgGyNq as weights/ultralytics49.pt... Done (2.8s)
Traceback (most recent call last):
  File "train.py", line 444, in <module>
    train()  # train normally
  File "train.py", line 111, in train
    chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
  File "train.py", line 111, in <dictcomp>
    chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
KeyError: 'module_list.85.Conv2d.weight'
daddydrac commented 4 years ago

I am having a very similar issue:

  File "train.py", line 111, in <dictcomp>
    chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
KeyError: 'module_list.85.Conv2d.weight'

daddydrac commented 4 years ago

I think something is wrong w/ the custom .cfg and/or .data file, because when I do a sanity check w/ the default files I get:

'No labels found. Recommend correcting image and label paths.' AssertionError: No labels found. Recommend correcting image and label paths.

Please see, "Train On Custom Data" - https://github.com/ultralytics/yolov3/issues/621

FranciscoReveriano commented 4 years ago

Did you check the coco.data file? And your .cfg file should have nothing to do with this.

The easiest way to fix this is to make sure that you have a directory called 'labels' inside your data directory. In this directory you place all the labels for both the training and test/validation images. Also make sure that the path names of your images are correct. I have found relative paths to work better than full paths.

image

daddydrac commented 4 years ago

Nope still broken

daddydrac commented 4 years ago

Why aren't there instructions on simply running your own images thru it, while using coco/yolo, and getting some metrics like mAP and false positives and negatives? I can't believe the docs have made it this hard. I'm willing to rewrite them if I can figure this out.

glenn-jocher commented 4 years ago

@alontrais @joehoeller thank you for your interest in our work! Please note that most technical problems are due to:

If none of these apply to you, we suggest you close this issue and raise a new one using the Bug Report template, providing screenshots and minimum viable code to reproduce your issue. Thank you!

yle8458 commented 4 years ago

@alontrais I had a similar error before, and I figured it out. On my end the cause was that I used yolov3.cfg as my config but kept the default weight file 'ultralytics49.pt', and the two do not match.

In the case that you want to use the default weight, you can use the yolov3-spp.cfg as a baseline and modify the corresponding filters/num_class as instructed.
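
As a rough guide to the filters/num_class change mentioned above: in the standard YOLOv3 cfg layout, the convolutional layer directly before each [yolo] layer needs (classes + 5) filters per anchor, where the 5 covers four box coordinates plus objectness. The sketch below assumes 3 anchors per detection layer, which is the usual value in yolov3.cfg and yolov3-spp.cfg; the function name is made up for illustration and is not part of the repo.

```python
def yolo_head_filters(num_classes: int, anchors_per_layer: int = 3) -> int:
    """Filters for the conv layer feeding a [yolo] layer: (classes + 5) per anchor."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return (num_classes + 5) * anchors_per_layer


if __name__ == "__main__":
    # Print the value to put in the cfg for a few class counts.
    for n in (1, 20, 80):
        print(f"classes={n} -> filters={yolo_head_filters(n)}")
```

For COCO's 80 classes this gives 255; for a single-class dataset it gives 18.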

daddydrac commented 4 years ago

@glenn-jocher I followed your instructions:

sudo rm -rf yolov3  # remove existing repo
git clone https://github.com/ultralytics/yolov3 && cd yolov3 # git clone latest
python3 detect.py  # verify detection
python3 train.py  # verify training (a few batches only)

I get this when I run train.py:

line 374, in __init__
    assert nf > 0, 'No labels found. Recommend correcting image and label paths.'
AssertionError: No labels found. Recommend correcting image and label paths.
glenn-jocher commented 4 years ago

@joehoeller you need the coco dataset to run the training examples:

$ bash yolov3/data/get_coco_dataset_gdrive.sh
daddydrac commented 4 years ago

Yes, I already did that. Is there something that needs to be done to the labels, other than putting them in the /data folder? For example, should they stay in the nested folders they came in, as @FranciscoReveriano stated above? (The labels were copied, so their original path is still intact.)

glenn-jocher commented 4 years ago

@joehoeller nothing needs to be done to the labels. You just git clone the repo, copy the coco dataset and train. You can even follow the notebook, just click play in each cell.

https://colab.research.google.com/drive/1G8T-VFxQkjDe4idzN8F-hbIBqkkkQnxw

daddydrac commented 4 years ago

I still get the error. Why did you close it?

glenn-jocher commented 4 years ago

@joehoeller your error is not reproducible, there's no bug. Follow the steps, everything works properly.

daddydrac commented 4 years ago

That is false sir, because I did. And I get the error for the labels as shown.

glenn-jocher commented 4 years ago

@joehoeller To get started simply run the following in a terminal, or open the notebook and click play on the first cells (same code): https://colab.research.google.com/drive/1G8T-VFxQkjDe4idzN8F-hbIBqkkkQnxw

rm -rf yolov3 coco coco.zip  # WARNING: remove existing 
git clone https://github.com/ultralytics/yolov3  # clone
bash yolov3/data/get_coco_dataset_gdrive.sh  # copy COCO2014 dataset (19GB)
cd yolov3
python3 train.py
daddydrac commented 4 years ago

How many times do I have to tell you I did that. I’m moving on to build my own solution — which I can do, I was just hoping to save time.

FranciscoReveriano commented 4 years ago

Don't be rude! Instead of complaining, you need to embrace the spirit of collaboration. This is the best PyTorch implementation public. Contribute to making it better.

FYI: if you are not on a notebook and you want to run this, I would advise that you follow the setup that is made by

bash get_coco_dataset.sh

There you will get the perfect structure.

daddydrac commented 4 years ago

Let me make this more clear for you since you do not understand:

I followed the steps exactly as stated. Then I got the error message about the labels, which still persists.

So no it’s not the best. I’m making my own so I don’t waste any more time. I just thought I could save time using this, and clearly I was wrong.

glenn-jocher commented 4 years ago

@joehoeller if the default code I sent you works in your environment, then use that as a starting point for your own development efforts. You simply mimic the coco data format with your own data. All of the info, including step by step directions and code to reproduce are in the custom training example in the wiki. https://github.com/ultralytics/yolov3/wiki

daddydrac commented 4 years ago

It does not for the last time. How many times do I have to tell you. Scroll up and read the label error. Because that’s what I get after I performed the command line cmd’s as given per your instructions.

daddydrac commented 4 years ago

Actually don’t bother because I’m already hooking up analytics and metrics to my own solution I’ve built in Torch w Tensorboard.

Samjith888 commented 4 years ago

I got the same error:

Traceback (most recent call last):
  File "train.py", line 444, in <module>
    train()  # train normally
  File "train.py", line 111, in train
    chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
  File "train.py", line 111, in <dictcomp>
    chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
KeyError: 'module_list.85.Conv2d.weight'

I have tried the suggested steps, but nothing worked out. https://github.com/ultralytics/yolov3/issues/650#issuecomment-557939734

inspire-lts commented 4 years ago

So sad! The same error:

  File "train.py", line 444, in <module>
    train()  # train normally
  File "train.py", line 111, in train
    chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
  File "train.py", line 111, in <dictcomp>
    chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}

glenn-jocher commented 4 years ago

@Samjith888 @inspire-lts @joehoeller see https://github.com/ultralytics/yolov3/issues/657

This error is caused by a user supplying incompatible --weights and --cfg arguments. To solve this you must specify no weights (i.e. random initialization of the model) using --weights '' and any --cfg, or use a --cfg that is compatible with your --weights. If none are specified, the defaults are --weights ultralytics49.pt and --cfg cfg/yolov3-spp.cfg.

Examples of compatible combinations are:

python3 train.py --weights yolov3.pt --cfg cfg/yolov3.cfg
python3 train.py --weights yolov3.weights --cfg cfg/yolov3.cfg
python3 train.py --weights yolov3-spp.pt --cfg cfg/yolov3-spp.cfg
python3 train.py --weights ultralytics49.pt --cfg cfg/yolov3-spp.cfg
python3 train.py --weights '' --cfg cfg/*.cfg  # any cfg will work here

ultralytics49.pt is currently the highest performing YOLOv3 model (trained from scratch using this repo) available at the default img-size of 416 (see https://github.com/ultralytics/yolov3/issues/310), which is the reason it is used as the default backbone.
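
To see why a mismatch surfaces as a KeyError: the dict comprehension on line 111 of train.py indexes model.state_dict()[k] for every key in the checkpoint, so any key that exists in the checkpoint but not in the model built from --cfg fails the lookup before the numel() comparison runs. The sketch below is a defensive variant of that filter, not the repo's code and not the maintainers' recommended fix (which is to pass matching --weights and --cfg). FakeTensor is a hypothetical stand-in for torch.Tensor so the example runs without PyTorch; it relies only on the numel() method that real tensors also provide.

```python
def filter_compatible(checkpoint_sd, model_sd):
    """Keep checkpoint entries whose key exists in the model with a matching element count.

    Returns (kept, skipped) so that a mismatch is reported instead of hidden.
    """
    kept = {}
    skipped = []
    for key, value in checkpoint_sd.items():
        target = model_sd.get(key)  # None when the model has no such layer
        if target is not None and target.numel() == value.numel():
            kept[key] = value
        else:
            skipped.append(key)
    return kept, skipped


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; only numel() is needed here."""

    def __init__(self, n):
        self._n = n

    def numel(self):
        return self._n


if __name__ == "__main__":
    checkpoint = {
        "module_list.0.Conv2d.weight": FakeTensor(864),
        "module_list.85.Conv2d.weight": FakeTensor(1024),
    }
    model = {"module_list.0.Conv2d.weight": FakeTensor(864)}
    kept, skipped = filter_compatible(checkpoint, model)
    print("kept:", sorted(kept))
    print("skipped:", skipped)
```

A long skipped list is a strong hint that the weights and cfg describe different architectures, in which case training from the partially loaded checkpoint is unlikely to be what you want.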

daddydrac commented 4 years ago

So for the last time, what does this mean and how do I fix it:

assert nf > 0, 'No labels found. Recommend correcting image and label paths.'

daddydrac commented 4 years ago

A lot of this could be resolved if there were better docs and tutorials, with some minor improvements in the code. How can we work together to make this happen?

FranciscoReveriano commented 4 years ago

They were pretty clear in their meanings. But it took me a second try to get my labels working. I am going to write a Medium article on how to use the Ultralytics model to train better, and write a wiki page on distributed computing. It is a great model, with some amazing work behind it. But I feel that if we all contribute it can be the top YOLOv3 model.

daddydrac commented 4 years ago

For sure, I have some ideas and some more ppl in CV space willing to help as well.

FranciscoReveriano commented 4 years ago

Send me a message and we can collaborate on the article. Or add me on LinkedIn, which is on my profile.

daddydrac commented 4 years ago

Will do. Thanks!

glenn-jocher commented 4 years ago

I just updated the mAP section of the README with the latest results. We're making good progress on training. Earlier in the year we were behind darknet, now we are ahead in most metrics, using the same yolov3-spp.cfg architecture. The best results now are from ultralytics68.pt, which I should have up on the Google Drive folder soon.

https://github.com/ultralytics/yolov3#map

Model                    320 mAP@0.5:0.95   416 mAP@0.5:0.95   608 mAP@0.5:0.95
darknet YOLOv3-tiny      14.0               16.0               16.6
darknet YOLOv3           28.7               31.1               33.0
darknet YOLOv3-SPP       30.5               33.9               37.0
ultralytics YOLOv3-SPP   35.2               38.8               40.4

Yes, a Medium article and better docs would be great! Unfortunately I don't have much time, between running different trainings and developing/debugging.

daddydrac commented 4 years ago

How do we correct image/label paths? I have them but it is not clear as to where to set those up at.

AssertionError: No labels found. Recommend correcting image and label paths.

daddydrac commented 4 years ago

Send me a message we can collaborate in article. Or add me at Linked which is my profile.

Message is in your LinkedIn inbox. I built the automation tool, which I now call "Dark Chocolate"; it converts COCO annotations to the Darknet annotation format.

glenn-jocher commented 4 years ago

@joehoeller coco.data points to the train.txt and test.txt list of images on lines 2 and 3. image

These files have lists of image paths as they would be from the yolov3 directory: image

If in doubt, you can run python3 train.py in debug mode, and put a breakpoint on this line to see what values the img_files are. If there are no images there, or if there are no labels in the corresponding labels folder (by replacing /images/ with /labels/ in the image paths) you will get this error message.

https://github.com/ultralytics/yolov3/blob/5bcc2b38b8d8d2fbf7edd84cfbca8c7063cb4bfe/utils/datasets.py#L261-L262
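
If a debugger isn't handy, the same check can be approximated outside of train.py. The following is a minimal sketch, assuming the rule described above (swap /images/ for /labels/ in each image path and use a .txt extension); the function names are hypothetical and this is not the code from utils/datasets.py. It reads a list file such as the train.txt referenced by coco.data and reports which images have no label file.

```python
import os
import sys


def label_path_for(image_path: str) -> str:
    """Derive the label path: swap /images/ for /labels/ and the extension for .txt."""
    root, _ext = os.path.splitext(image_path.replace("/images/", "/labels/"))
    return root + ".txt"


def check_image_list(list_file: str):
    """Return (images_with_labels, images_missing_labels) for a train/test list file."""
    found, missing = [], []
    with open(list_file) as f:
        for line in f:
            image_path = line.strip()
            if not image_path:
                continue  # ignore blank lines
            has_label = os.path.isfile(label_path_for(image_path))
            (found if has_label else missing).append(image_path)
    return found, missing


if __name__ == "__main__" and len(sys.argv) > 1:
    found, missing = check_image_list(sys.argv[1])
    print(f"{len(found)} images with labels, {len(missing)} without")
    for path in missing[:10]:
        print("missing label for:", path)
```

Relative paths in the list file are resolved against the current directory, so run it from the yolov3 folder. If it reports zero images with labels, that matches the condition behind the "No labels found" assertion.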

daddydrac commented 4 years ago

These are my own images and my own annotations (in Darknet format).

The images you show are just the paths to the images. Not the labels.

FranciscoReveriano commented 4 years ago

I will be uploading a reader for this if you have a custom dataset. All I can say is that it's better if you use the full paths of the images, so the computer knows where to grab the images/labels.

glenn-jocher commented 4 years ago

@joehoeller the same structure is used for custom data as for coco. The labels need to be in a separate folder next to the images folder. The labels folder is found simply by replacing /images/ with /labels/ in the image folder path, like this custom "dataset1" (ds1). Each label file name is identical to its image file name, except that the extension is *.txt. This example trains on the first 8 images of the dataset, and tests on the last 2.

The paths all need to be relative to your yolov3 folder (or absolute paths, though these break more easily if you send the code to a different environment).

Screen Shot 2019-12-01 at 1 37 27 PM

Then run:

cd yolov3
python3 train.py --data ../data/ds1/out.data
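
For anyone building the list files by hand, here is a minimal sketch of the split described above (the last 2 images go to the test list, the rest to the train list). The function name, the output file names and the set of image extensions are assumptions made for illustration; the contents of out.data are not shown in this thread, so the sketch does not generate that file.

```python
import os

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def write_split_lists(images_dir, train_list, test_list, n_test=2):
    """Write image paths to train/test list files; the last n_test images go to test."""
    if n_test < 1:
        raise ValueError("n_test must be at least 1")
    names = sorted(
        n for n in os.listdir(images_dir)
        if os.path.splitext(n)[1].lower() in IMAGE_EXTENSIONS
    )
    if len(names) <= n_test:
        raise ValueError("not enough images to split")
    # Forward slashes keep the /images/ -> /labels/ substitution applicable.
    paths = [images_dir.rstrip("/") + "/" + n for n in names]
    train, test = paths[:-n_test], paths[-n_test:]
    for list_path, subset in ((train_list, train), (test_list, test)):
        with open(list_path, "w") as f:
            f.write("\n".join(subset) + "\n")
    return train, test
```

For a 10-image folder, write_split_lists("../data/ds1/images", "out_train.txt", "out_test.txt") would put 8 paths in the first file and 2 in the second; pass images_dir exactly as it should appear in the list files (relative to the yolov3 folder).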
glenn-jocher commented 4 years ago

BTW @FranciscoReveriano @joehoeller this is legacy structure from darknet, so the same exact data can also be used to train darknet.

This repo now outperforms darknet by a wide margin I believe, but nevertheless darknet has a strong following (i.e. pjreddie/darknet has 15k stars, AlexeyAB/darknet has 6k stars), so I'm not sure if we should keep following the darknet convention, or perhaps start from a clean-slate mentality about what would make it easiest for the most people to train on their own custom data with a minimum of hassle.

In principle this repo is here to create the most accurate, fastest object detector in the world. In practice though, people seem to care more about quick results and ease of use, and don't care as much about being the best or the fastest.

FranciscoReveriano commented 4 years ago

I think we need to continue with the Darknet convention. I guess people still follow it because it provides a nice benchmark with a lot of literature. Although I don't think Machine Learning or Object Detection should be 'people'-proof. At some point people should be expected to climb the learning curve. It seems like a lot of people just want quick fixes.

Although it might not be a bad idea to make a version of Facebook's Detectron 2 that could be sold. That would be the best way to start from a clean slate in my opinion.

daddydrac commented 4 years ago

I agree. One of my biggest rants is how 98% of Git repos in the CV space are awful and basic. People need to learn the concepts as well as the math. (I took the Udacity CV course and recommend it because it dives deep into Torch and the math.) However, for small one-offs, things like Darknet are perfect.

FranciscoReveriano commented 4 years ago

For me, the problem is when people ask you to interpret, figure out, or tell them how to make their results much better. This is GitHub, not ResearchGate. I was looking for a Udacity course to take this break. I might do that CV course. Most of my experience is with TensorFlow and Keras. Trying to move to Torch like the rest of us.

daddydrac commented 4 years ago

COCO JSON to Darknet/YOLOv3 annotation conversion tool, see readme for instructions and how to validate: https://github.com/joehoeller/Dark-Chocolate
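
For readers who only need the core of such a conversion, the arithmetic is short. This is a sketch of the general COCO-to-Darknet box conversion, not code taken from Dark-Chocolate: COCO stores a box as [x_min, y_min, width, height] in pixels, while a Darknet label line holds a class index followed by the box centre and size, each divided by the image width or height. The function names are hypothetical.

```python
def coco_bbox_to_darknet(bbox, image_width, image_height):
    """Convert a COCO [x_min, y_min, width, height] pixel box to Darknet's
    normalized (x_center, y_center, width, height)."""
    x_min, y_min, box_w, box_h = bbox
    x_center = (x_min + box_w / 2) / image_width
    y_center = (y_min + box_h / 2) / image_height
    return x_center, y_center, box_w / image_width, box_h / image_height


def darknet_label_line(class_index, bbox, image_width, image_height):
    """Format one line of a Darknet label file: 'class xc yc w h'."""
    values = coco_bbox_to_darknet(bbox, image_width, image_height)
    return f"{class_index} " + " ".join(f"{v:.6f}" for v in values)


if __name__ == "__main__":
    # A 30x40 px box with its top-left corner at (10, 20) in a 100x200 px image.
    print(darknet_label_line(0, [10, 20, 30, 40], 100, 200))
```

Note that COCO category ids are not contiguous, so a real converter also has to map each category_id to a zero-based class index that matches the order of the names file.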

daddydrac commented 4 years ago

@glenn-jocher you show paths for images but not labels. I have been doing all of this already, and just like others in this thread it continues to fail.

glenn-jocher commented 4 years ago

@joehoeller the label paths are inferred automatically by replacing /images/ with /labels/ in the image paths. You only need to specify image paths.

glenn-jocher commented 4 years ago

The labelfile definition happens here.

https://github.com/ultralytics/yolov3/blob/3d91731519dcbea9a1a2047817ba83ce1441358f/utils/datasets.py#L278-L281

daddydrac commented 4 years ago

This tutorial, https://docs.ultralytics.com/yolov5/tutorials/train_custom_data , says:

  1. Train. Run python3 train.py --data data/coco_10img.data to train using your custom data. If you created a custom *.cfg file as well, specify it using --cfg cfg/my_new_file.cfg.

I HAVE TRIED ALL OF THE SUGGESTIONS ABOVE AND STILL GET: assert nf > 0, 'No labels found. Recommend correcting image and label paths.'

daddydrac commented 4 years ago

This script will generate file paths to images:

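import os

given_dir = 'PATH_TO_CUSTOM_IMAGES'
# Write one image path per line; the context manager closes the file when done.
with open('FILE_NAME.txt', 'w') as list_file:
    for name in sorted(os.listdir(given_dir)):
        list_file.write(os.path.join(given_dir, name) + '\n')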
daddydrac commented 4 years ago

I got it going. Now I have a CUDA memory error, but that's a "me" problem, not a "you" problem. I will write a very clear and concise tutorial for Medium when I am done.

glenn-jocher commented 4 years ago

Yes, I think the default training settings should probably use a smaller batch size. The current settings should work fine for a 1080Ti or 2080Ti and up (11 GB of CUDA memory), but smaller graphics cards may run out.

The current default is --batch-size 32 --accumulate 2 to get to an effective batch size of 64. I think I should reduce this to --batch-size 16 --accumulate 4 so that the largest number of people can run smoothly without CUDA out-of-memory issues. The performance hit (from batch-normalizing over fewer images) is not very large.
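
The arithmetic behind the two settings can be written out as a small sketch. The helper names are hypothetical, and the stepping schedule (one optimizer step after every `accumulate` batches) is an assumption about how gradient accumulation is usually implemented, not something read from train.py.

```python
def effective_batch_size(batch_size: int, accumulate: int) -> int:
    """Images contributing to each optimizer step when gradients are accumulated."""
    return batch_size * accumulate


def optimizer_step_batches(num_batches: int, accumulate: int):
    """Zero-based indices of the batches after which an optimizer step is taken."""
    return [i for i in range(num_batches) if (i + 1) % accumulate == 0]


if __name__ == "__main__":
    for bs, acc in ((32, 2), (16, 4)):
        print(f"--batch-size {bs} --accumulate {acc} -> "
              f"effective batch size {effective_batch_size(bs, acc)}")
```

Both settings reach an effective batch size of 64, but the second holds only 16 images in GPU memory at a time, which is where the memory saving comes from.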

glenn-jocher commented 4 years ago

Ok, this should do it: https://github.com/ultralytics/yolov3/commit/93a70d958a1138b082f7b5c29c550b7d383f56f3

If you git pull you can get all the latest updates.