Closed star4s closed 2 years ago
👋 Hello @star4s, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://ultralytics.com or email support@ultralytics.com.
Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:
git clone https://github.com/ultralytics/yolov5 # clone
cd yolov5
pip install -r requirements.txt # install
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.
@star4s this doesn't have anything to do with evolution.
AutoAnchor needs data to work, you have almost zero data in your dataset, or all your objects are the exact same size.
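To make the point above concrete, here is a minimal sketch of a pre-flight check that counts how many distinct box sizes a dataset contains before any clustering is attempted. The helper names and the sample boxes are hypothetical and are not part of YOLOv5. The only assumption is that clustering cannot produce more useful anchors than there are distinct (width, height) pairs.

```python
def count_distinct_sizes(boxes, decimals=3):
    """Count distinct (width, height) pairs after rounding (hypothetical helper)."""
    return len({(round(w, decimals), round(h, decimals)) for w, h in boxes})


def enough_data_for_anchors(boxes, n_anchors):
    """Return True if there are at least n_anchors distinct box sizes."""
    return count_distinct_sizes(boxes) >= n_anchors


# Made-up example: 14 labelled boxes, most of them the same size.
boxes = [(0.072, 0.098)] * 10 + [(0.05, 0.07), (0.08, 0.11), (0.06, 0.09), (0.10, 0.12)]

print(len(boxes), "boxes,", count_distinct_sizes(boxes), "distinct sizes")
print("enough for 12 anchors:", enough_data_for_anchors(boxes, 12))
print("enough for 4 anchors:", enough_data_for_anchors(boxes, 4))
```

If a check like this fails, adding more varied labelled data is likely a better remedy than tuning the anchor settings.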
@star4s good news 😃! Your original issue may now be fixed ✅ in PR #6668. To receive this update:
- `git pull` from within your `yolov5/` directory, or `git clone https://github.com/ultralytics/yolov5` again
- `model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)`
- `sudo docker pull ultralytics/yolov5:latest` to update your image

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy training with YOLOv5 🚀!
**After I applied your modified autoanchor.py, I got more errors!
After several AutoAnchor runs, the earlier error no longer appears,
but then I hit an error like this:**
hyperparameters: lr0=0.00095, lrf=0.081, momentum=0.98, weight_decay=0.00051, warmup_epochs=2.58013, warmup_momentum=0.78619, warmup_bias_lr=0.12241, box=0.07791, cls=0.45521, cls_pw=1.32921, obj=1.05062, obj_pw=0.8944, iou_t=0.2, anchor_t=2.79565, anchors=4.49633, fl_gamma=0.0, hsv_h=0.0, hsv_s=0.0, hsv_v=0.0, degrees=0.0, translate=0.0, scale=0.0, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.0, mosaic=0.0, mixup=0.0, copy_paste=0.0
Weights & Biases: run 'pip install wandb' to automatically track and visualize YOLOv5 🚀 runs (RECOMMENDED)
Overriding model.yaml anchors with anchors=4.49633
from n params module arguments
0 -1 1 8800 models.common.Conv [3, 80, 6, 2, 2]
1 -1 1 115520 models.common.Conv [80, 160, 3, 2]
2 -1 4 309120 models.common.C3 [160, 160, 4]
3 -1 1 461440 models.common.Conv [160, 320, 3, 2]
4 -1 8 2259200 models.common.C3 [320, 320, 8]
5 -1 1 1844480 models.common.Conv [320, 640, 3, 2]
6 -1 12 13125120 models.common.C3 [640, 640, 12]
7 -1 1 5531520 models.common.Conv [640, 960, 3, 2]
8 -1 4 11070720 models.common.C3 [960, 960, 4]
9 -1 1 11061760 models.common.Conv [960, 1280, 3, 2]
10 -1 4 19676160 models.common.C3 [1280, 1280, 4]
11 -1 1 4099840 models.common.SPPF [1280, 1280, 5]
12 -1 1 1230720 models.common.Conv [1280, 960, 1, 1]
13 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
14 [-1, 8] 1 0 models.common.Concat [1]
15 -1 4 11992320 models.common.C3 [1920, 960, 4, False]
16 -1 1 615680 models.common.Conv [960, 640, 1, 1]
17 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
18 [-1, 6] 1 0 models.common.Concat [1]
19 -1 4 5332480 models.common.C3 [1280, 640, 4, False]
20 -1 1 205440 models.common.Conv [640, 320, 1, 1]
21 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
22 [-1, 4] 1 0 models.common.Concat [1]
23 -1 4 1335040 models.common.C3 [640, 320, 4, False]
24 -1 1 922240 models.common.Conv [320, 320, 3, 2]
25 [-1, 20] 1 0 models.common.Concat [1]
26 -1 4 4922880 models.common.C3 [640, 640, 4, False]
27 -1 1 3687680 models.common.Conv [640, 640, 3, 2]
28 [-1, 16] 1 0 models.common.Concat [1]
29 -1 4 11377920 models.common.C3 [1280, 960, 4, False]
30 -1 1 8296320 models.common.Conv [960, 960, 3, 2]
31 [-1, 12] 1 0 models.common.Concat [1]
32 -1 4 20495360 models.common.C3 [1920, 1280, 4, False]
33 [23, 26, 29, 32] 1 89712 models.yolo.Detect [2, [[0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7]], [320, 640, 960, 1280]]
Model Summary: 733 layers, 140067472 parameters, 140067472 gradients, 208.4 GFLOPs
Transferred 954/963 items from yolov5x6.pt
Scaled weight_decay = 0.00051
optimizer: SGD with parameter groups 159 weight (no decay), 163 weight, 163 bias
train: Scanning '/yolov5/datasets/OP1_test/labels/train.cache' images and labels... 14 found, 0 missing, 0 empty, 0 corrupt: 100%|██████████████████████| 14/14 [00:00<?, ?it/s]
train: Caching images (0.1GB ram): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 14/14 [00:00<00:00, 115.34it/s]
val: Scanning '/yolov5/datasets/OP1_test/labels/train.cache' images and labels... 14 found, 0 missing, 0 empty, 0 corrupt: 100%|████████████████████████| 14/14 [00:00<?, ?it/s]
AutoAnchor: 0.00 anchors/target, 0.000 Best Possible Recall (BPR). Anchors are a poor fit to dataset ⚠️, attempting to improve...
AutoAnchor: Running kmeans for 16 anchors on 14 points...
AutoAnchor: ERROR: Cannot take a larger sample than population when 'replace=False'
Traceback (most recent call last):
File "train.py", line 638, in
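The log above asks for 16 anchors from only 14 points, and sampling without replacement cannot return more items than the population holds. The sketch below reproduces that arithmetic with the standard library only. It assumes the fractional `anchors` value is rounded to an integer per detection layer and that the model has 4 detection layers, which is consistent with the 16 (and, elsewhere in this thread, 12) anchors reported in the logs. The function names are hypothetical, and `random.sample` is used only as an analogy for the sampling step that failed.

```python
import random


def anchors_requested(anchors_per_layer, detection_layers):
    """Total anchors = rounded anchors per layer x number of detection layers (assumed)."""
    return round(anchors_per_layer) * detection_layers


def pick_initial_centroids(points, k, seed=0):
    """Draw k distinct starting points; raises ValueError if k > len(points)."""
    rng = random.Random(seed)
    return rng.sample(points, k)


points = list(range(14))               # stand-ins for 14 labelled boxes
k = anchors_requested(4.49633, 4)      # evolved value from the log -> 16
print("anchors requested:", k, "points available:", len(points))

try:
    pick_initial_centroids(points, k)
except ValueError as err:
    print("sampling failed:", err)
```

With only 14 labelled boxes, any evolved `anchors` value that rounds to 4 or more per layer would hit this limit.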
@star4s I already told you:
AutoAnchor needs data to work, you have almost zero data in your dataset, or all your objects are the exact same size.
@glenn-jocher Hi, Thank you for your answer and your help.
My custom data information:
- image size: 2352 x 1728
- number of classes: 17
My command for my custom data:
python train.py --img 2352 --batch 2 --epochs 5 --data test.yaml --cfg './models/yolov5x6.yaml' --weights yolov5x6.pt --cache --evolve 1000
My yolov5x6.yaml:
nc: 17 # number of classes
depth_multiple: 1.33 # model depth multiple
width_multiple: 1.25 # layer channel multiple
anchors:
backbone:
  [ [ -1, 1, Focus, [ 64, 3 ] ], # 0-P1/2
    [ -1, 1, Conv, [ 128, 3, 2 ] ], # 1-P2/4
    [ -1, 3, C3, [ 128 ] ],
    [ -1, 1, Conv, [ 256, 3, 2 ] ], # 3-P3/8
    [ -1, 9, C3, [ 256 ] ],
    [ -1, 1, Conv, [ 512, 3, 2 ] ], # 5-P4/16
    [ -1, 9, C3, [ 512 ] ],
    [ -1, 1, Conv, [ 768, 3, 2 ] ], # 7-P5/32
    [ -1, 3, C3, [ 768 ] ],
    [ -1, 1, Conv, [ 1024, 3, 2 ] ], # 9-P6/64
    [ -1, 1, SPP, [ 1024, [ 3, 5, 7 ] ] ],
    [ -1, 3, C3, [ 1024, False ] ], # 11
  ]
head:
  [ [ -1, 1, Conv, [ 768, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 8 ], 1, Concat, [ 1 ] ], # cat backbone P5
    [ -1, 3, C3, [ 768, False ] ], # 15
    [ -1, 1, Conv, [ 512, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 6 ], 1, Concat, [ 1 ] ], # cat backbone P4
    [ -1, 3, C3, [ 512, False ] ], # 19
    [ -1, 1, Conv, [ 256, 1, 1 ] ],
    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 4 ], 1, Concat, [ 1 ] ], # cat backbone P3
    [ -1, 3, C3, [ 256, False ] ], # 23 (P3/8-small)
    [ -1, 1, Conv, [ 256, 3, 2 ] ],
    [ [ -1, 20 ], 1, Concat, [ 1 ] ], # cat head P4
    [ -1, 3, C3, [ 512, False ] ], # 26 (P4/16-medium)
    [ -1, 1, Conv, [ 512, 3, 2 ] ],
    [ [ -1, 16 ], 1, Concat, [ 1 ] ], # cat head P5
    [ -1, 3, C3, [ 768, False ] ], # 29 (P5/32-large)
    [ -1, 1, Conv, [ 768, 3, 2 ] ],
    [ [ -1, 12 ], 1, Concat, [ 1 ] ], # cat head P6
    [ -1, 3, C3, [ 1024, False ] ], # 32 (P6/64-xlarge)
    [ [ 23, 26, 29, 32 ], 1, Detect, [ nc, anchors ] ], # Detect(P3, P4, P5, P6)
  ]
I changed a lot of my data so that AutoAnchor works, and then I ran Hyperparameter Evolution. The best results for coco128 and for my data are strangely different.

| Dataset | Best generation | Last generation | metrics/precision | metrics/recall | metrics/mAP_0.5 | metrics/mAP_0.5:0.95 | val/box_loss |
| --- | --- | --- | --- | --- | --- | --- | --- |
| coco128 | 246 | 299 | 0.62719 | 0.79158 | 0.79514 | 0.52422 | 0.034782 |
| My custom data | 154 | 299 | 2.945e-06 | 0.00092593 | 1.4938e-06 | 2.9876e-07 | 0.031016 |

Does this mean something is wrong with my custom data? Are my annotations wrong?
Most of the metrics/precision and metrics/recall values from my custom data were 0. Why are most of them zero?
Thank you for your attention.
@star4s evolution should be run from a stable starting scenario. If your starting scenario returns zero mAP evolution will not help.
@glenn-jocher Thank you for your answer and your help.
You mentioned a stable starting scenario.
What does a stable starting scenario mean?
Do I need to modify the structure of yolov5x6.yaml, such as the YOLOv5 head and the YOLOv5 backbone?
@star4s Nothing needs to be modified anywhere. Evolution improves results on your base scenario. If your base scenario is returning zero mAP there's not much for evolution to work with.
Like any nonlinear optimization problem the final result is a function of the initial guess.
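The remark about the initial guess can be illustrated with a toy example unrelated to YOLOv5: plain gradient descent on a one-dimensional function with two valleys. The function, learning rate, and starting points below are arbitrary choices made for illustration only.

```python
def f(x):
    """A simple non-convex function with two local minima."""
    return x**4 - 3 * x**2 + x


def grad(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1


def descend(x0, lr=0.01, steps=2000):
    """Run fixed-step gradient descent from the starting point x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


left = descend(-2.0)   # starts on the left slope
right = descend(2.0)   # starts on the right slope
print(f"start -2.0 -> x = {left:.4f}, f(x) = {f(left):.4f}")
print(f"start  2.0 -> x = {right:.4f}, f(x) = {f(right):.4f}")
```

The same optimizer ends in a different valley depending on where it starts, and the two valleys have different depths. Evolution behaves analogously: it refines the scenario it is given rather than replacing it.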
Search before asking
Question
Training on my custom data does work with "--noautoanchor".
I need to use Hyperparameter Evolution for hyperparameter tuning.
First, I tested the COCO128 example.
python train.py --img-size 2352 --batch 1 --epochs 1 --data coco128.yaml --hyp './data/hyps/hyp.scratch.yaml' --cfg './models/yolov5x6.yaml' --weights yolov5x6.pt --cache --evolve &
Hyperparameter Evolution on COCO128 works well.
My command for my custom data:
python train.py --img 2352 --batch 1 --epochs 1 --data test.yaml --cfg './models/yolov5x6.yaml' --weights yolov5x6.pt --cache --evolve 2
After running this command, I get the following error:
AutoAnchor: ERROR: AutoAnchor: ERROR: scipy.cluster.vq.kmeans requested 12 points but returned only 9.
I followed the instructions in the Hyperparameter Evolution guide.
My setting:
Name Version Build Channel
_libgcc_mutex 0.1 main
ca-certificates 2021.10.26 h06a4308_2
certifi 2021.10.8 py38h06a4308_2
cycler 0.11.0
fonttools 4.29.1
google-auth 2.6.0
google-auth-oauthlib 0.4.6
grpcio 1.43.0
idna 3.3
importlib-metadata 4.11.1
kiwisolver 1.3.2
ld_impl_linux-64 2.35.1 h7274673_9
matplotlib 3.5.1
ncurses 6.3 h7f8727e_2
oauthlib 3.2.0
opencv-python 4.5.5.62
openssl 1.1.1m h7f8727e_0
pandas 1.4.1
Pillow 9.0.1
pip 21.2.4 py38h06a4308_0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pyparsing 3.0.7
python 3.8.12 h12debd9_0
pytz 2021.3
PyYAML 6.0
readline 8.1.2 h7f8727e_1
requests-oauthlib 1.3.1
rsa 4.8
scipy 1.8.0
seaborn 0.11.2
setuptools 58.0.4 py38h06a4308_0
sqlite 3.37.2 hc218d9a_0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
thop-0.0.31 2005241907
tk 8.6.11 h1ccaba5_0
torchvision 0.11.3
tqdm 4.62.3
typing_extensions 4.1.1
urllib3 1.26.8
Werkzeug 2.0.3
wheel 0.37.1 pyhd3eb1b0_0
zlib 1.2.11 h7f8727e_4
absl-py 1.0.0
cachetools 5.0.0
charset-normalizer 2.0.12
libffi 3.3 he6710b0_2
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
Markdown 3.3.6
numpy 1.22.2
packaging 21.3
protobuf 3.19.4
python-dateutil 2.8.2
requests 2.27.1
six 1.16.0
tensorboard 2.8.0
torch 1.10.2
xz 5.2.5 h7b6447c_0
zipp 3.7.0
How can I use Hyperparameter Evolution with my custom data?
Additional
A label from my custom data: 1 0.52 0.921 0.072 0.098
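The label line follows the usual YOLO text format: class index, then box center x, center y, width, and height, all normalized to the image size. As a rough sketch, the snippet below converts it to pixel coordinates, assuming the 2352 x 1728 image size reported elsewhere in this thread. The function name is hypothetical.

```python
def yolo_label_to_pixels(line, img_w, img_h):
    """Convert 'cls xc yc w h' (normalized) to (cls, (x1, y1, x2, y2)) in pixels."""
    cls, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return int(cls), (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


cls, (x1, y1, x2, y2) = yolo_label_to_pixels("1 0.52 0.921 0.072 0.098", 2352, 1728)
print("class:", cls)
print("box width in pixels:", round(x2 - x1, 3))
print("box height in pixels:", round(y2 - y1, 3))
```

Under that image-size assumption the box comes out roughly square in pixels. If every label has the same size, the clustering step has very little variety to work with.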
After running Hyperparameter Evolution:
train: weights=yolov5x6.pt, cfg=./models/yolov5x6.yaml, data=OP1_WW.yaml, hyp=./data/hyps/hyp.scratch.yaml, epochs=1, batch_size=1, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=2, bucket=, cache=ram, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, linear_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: skipping check (offline), for updates see https://github.com/ultralytics/yolov5
YOLOv5 🚀 2022-2-8 torch 1.10.2+cu102 CUDA:0 (Tesla V100-SXM2-16GB, 16160MiB)
hyperparameters: lr0=0.001, lrf=0.1, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, anchors=3, fl_gamma=0.0, hsv_h=0.0, hsv_s=0.0, hsv_v=0.0, degrees=0.0, translate=0.0, scale=0.0, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.0, mosaic=0.0, mixup=0.0, copy_paste=0.0
Weights & Biases: run 'pip install wandb' to automatically track and visualize YOLOv5 🚀 runs (RECOMMENDED)
Overriding model.yaml anchors with anchors=3
0 -1 1 8800 models.common.Conv [3, 80, 6, 2, 2]
1 -1 1 115520 models.common.Conv [80, 160, 3, 2]
2 -1 4 309120 models.common.C3 [160, 160, 4]
3 -1 1 461440 models.common.Conv [160, 320, 3, 2]
4 -1 8 2259200 models.common.C3 [320, 320, 8]
5 -1 1 1844480 models.common.Conv [320, 640, 3, 2]
6 -1 12 13125120 models.common.C3 [640, 640, 12]
7 -1 1 5531520 models.common.Conv [640, 960, 3, 2]
8 -1 4 11070720 models.common.C3 [960, 960, 4]
9 -1 1 11061760 models.common.Conv [960, 1280, 3, 2]
10 -1 4 19676160 models.common.C3 [1280, 1280, 4]
11 -1 1 4099840 models.common.SPPF [1280, 1280, 5]
12 -1 1 1230720 models.common.Conv [1280, 960, 1, 1]
13 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
14 [-1, 8] 1 0 models.common.Concat [1]
15 -1 4 11992320 models.common.C3 [1920, 960, 4, False]
16 -1 1 615680 models.common.Conv [960, 640, 1, 1]
17 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
18 [-1, 6] 1 0 models.common.Concat [1]
19 -1 4 5332480 models.common.C3 [1280, 640, 4, False]
20 -1 1 205440 models.common.Conv [640, 320, 1, 1]
21 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
22 [-1, 4] 1 0 models.common.Concat [1]
23 -1 4 1335040 models.common.C3 [640, 320, 4, False]
24 -1 1 922240 models.common.Conv [320, 320, 3, 2]
25 [-1, 20] 1 0 models.common.Concat [1]
26 -1 4 4922880 models.common.C3 [640, 640, 4, False]
27 -1 1 3687680 models.common.Conv [640, 640, 3, 2]
28 [-1, 16] 1 0 models.common.Concat [1]
29 -1 4 11377920 models.common.C3 [1280, 960, 4, False]
30 -1 1 8296320 models.common.Conv [960, 960, 3, 2]
31 [-1, 12] 1 0 models.common.Concat [1]
32 -1 4 20495360 models.common.C3 [1920, 1280, 4, False]
33 [23, 26, 29, 32] 1 67284 models.yolo.Detect [2, [[0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5]], [320, 640, 960, 1280]]
Model Summary: 733 layers, 140045044 parameters, 140045044 gradients, 208.3 GFLOPs
Transferred 954/963 items from yolov5x6.pt
Scaled weight_decay = 0.0005
optimizer: SGD with parameter groups 159 weight (no decay), 163 weight, 163 bias
train: Scanning '/yolov5/datasets/OP1_test/labels/train.cache' images and labels... 14 found, 0 missing, 0 empty, 0 corrupt: 100%|██████████████████████| 14/14 [00:00<?, ?it/s]
train: Caching images (0.0GB ram): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 14/14 [00:00<00:00, 131.07it/s]
val: Scanning '/yolov5/datasets/OP1_test/labels/train.cache' images and labels... 14 found, 0 missing, 0 empty, 0 corrupt: 100%|████████████████████████| 14/14 [00:00<?, ?it/s]
AutoAnchor: 1.71 anchors/target, 0.429 Best Possible Recall (BPR). Anchors are a poor fit to dataset ⚠️, attempting to improve...
AutoAnchor: Running kmeans for 12 anchors on 14 points...
AutoAnchor: ERROR: AutoAnchor: ERROR: scipy.cluster.vq.kmeans requested 12 points but returned only 9
Traceback (most recent call last):
File "train.py", line 638, in
main(opt)
File "train.py", line 616, in main
results = train(hyp.copy(), opt, device, callbacks)
File "train.py", line 248, in train
check_anchors(dataset, model=model, thr=hyp['anchor_t'], imgsz=imgsz)
File "/yolov5/utils/autoanchor.py", line 55, in check_anchors
new_bpr = metric(anchors)[0]
File "/yolov5/utils/autoanchor.py", line 36, in metric
r = wh[:, None] / k[None]
RuntimeError: The size of tensor a (14) must match the size of tensor b (4) at non-singleton dimension 1
Exception in thread Thread-13:
Traceback (most recent call last):
File "yolo_2/lib/python3.8/threading.py", line 932, in _bootstrap_inner