Closed kongkk233 closed 3 years ago
Hello @kongkk233, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.
Python 3.8 or later with all requirements.txt dependencies installed, including `torch>=1.7`. To install run:
$ pip install -r requirements.txt
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.
@kongkk233 hello, thank you for your interest in YOLOv5! This issue seems to lack the minimum requirements for a proper response, or is insufficiently detailed for us to help you. Please note that most technical problems are due to:
- **Your changes to the code.** If your issue is not reproducible in a fresh `git clone` version of this repo we can not debug it. Before going further run this code and verify your issue persists:
$ git clone https://github.com/ultralytics/yolov5 yolov5_new # clone latest
$ cd yolov5_new
$ python detect.py # verify detection
- **Your custom data.** If your issue is not reproducible in one of our 3 common datasets ([COCO](https://github.com/ultralytics/yolov5/blob/master/data/coco.yaml), [COCO128](https://github.com/ultralytics/yolov5/blob/master/data/coco128.yaml), or [VOC](https://github.com/ultralytics/yolov5/blob/master/data/voc.yaml)) we can not debug it. Visit our [Custom Training Tutorial](https://docs.ultralytics.com/yolov5/tutorials/train_custom_data) for guidelines on training your custom data. Examine `train_batch0.jpg` and `test_batch0.jpg` for a sanity check of your labels and images.
- **Your environment.** If your issue is not reproducible in one of the verified environments below we can not debug it. If you are running YOLOv5 locally, verify your environment meets all of the [requirements.txt](https://github.com/ultralytics/yolov5/blob/master/requirements.txt) dependencies specified below. If in doubt, download Python 3.8.0 from https://www.python.org/, create a new [venv](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/), and then install requirements.
If none of these apply to you, we suggest you close this issue and raise a new one using the **Bug Report template**, providing screenshots and **minimum viable code to reproduce your issue**. Thank you!
## Requirements
Python 3.8 or later with all [requirements.txt](https://github.com/ultralytics/yolov5/blob/master/requirements.txt) dependencies installed, including `torch>=1.6`. To install run:
```bash
$ pip install -r requirements.txt
```
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are passing. These tests evaluate proper operation of basic YOLOv5 functionality, including training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu.
When I train in Colab, this error occurs after training has run for a while, and it happens on every training run. The images in the dataset are all valid and should not be None.
@kongkk233 please supply a Colab notebook with a reproducible example.
@glenn-jocher thanks for your reply
input
!python train.py --img 416 --batch 16 --epochs 100 --data LSB.yaml --weights yolov5l.pt
output
Using CUDA device0 _CudaDeviceProperties(name='Tesla K80', total_memory=11441MB)
Namespace(adam=False, batch_size=16, bucket='', cache_images=False, cfg='models/yolov5l.yaml', data='./data/LSB.yaml', device='', epochs=100, evolve=False, global_rank=-1, hyp='data/hyp.scratch.yaml', image_weights=False, img_size=[416, 416], local_rank=-1, logdir='runs/', multi_scale=False, name='', noautoanchor=False, nosave=False, notest=False, rect=False, resume=False, single_cls=False, sync_bn=False, total_batch_size=16, weights='yolov5l.pt', workers=8, world_size=1)
Start Tensorboard with "tensorboard --logdir runs/", view at http://localhost:6006/
2020-11-26 11:24:39.627558: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Hyperparameters {'lr0': 0.01, 'lrf': 0.2, 'momentum': 0.937, 'weight_decay': 0.0005, 'warmup_epochs': 3.0, 'warmup_momentum': 0.8, 'warmup_bias_lr': 0.1, 'box': 0.05, 'cls': 0.5, 'cls_pw': 1.0, 'obj': 1.0, 'obj_pw': 1.0, 'iou_t': 0.2, 'anchor_t': 4.0, 'fl_gamma': 0.0, 'hsv_h': 0.015, 'hsv_s': 0.7, 'hsv_v': 0.4, 'degrees': 0.0, 'translate': 0.1, 'scale': 0.5, 'shear': 0.0, 'perspective': 0.0, 'flipud': 0.0, 'fliplr': 0.5, 'mosaic': 1.0, 'mixup': 0.0}
from n params module arguments
0 -1 1 7040 models.common.Focus [3, 64, 3]
1 -1 1 73984 models.common.Conv [64, 128, 3, 2]
2 -1 1 161152 models.common.BottleneckCSP [128, 128, 3]
3 -1 1 295424 models.common.Conv [128, 256, 3, 2]
4 -1 1 1627904 models.common.BottleneckCSP [256, 256, 9]
5 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
6 -1 1 6499840 models.common.BottleneckCSP [512, 512, 9]
7 -1 1 4720640 models.common.Conv [512, 1024, 3, 2]
8 -1 1 2624512 models.common.SPP [1024, 1024, [5, 9, 13]]
9 -1 1 10234880 models.common.BottleneckCSP [1024, 1024, 3, False]
10 -1 1 525312 models.common.Conv [1024, 512, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 2823680 models.common.BottleneckCSP [1024, 512, 3, False]
14 -1 1 131584 models.common.Conv [512, 256, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 707328 models.common.BottleneckCSP [512, 256, 3, False]
18 -1 1 590336 models.common.Conv [256, 256, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 2561536 models.common.BottleneckCSP [512, 512, 3, False]
21 -1 1 2360320 models.common.Conv [512, 512, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 10234880 models.common.BottleneckCSP [1024, 1024, 3, False]
24 [17, 20, 23] 1 32310 models.yolo.Detect [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [256, 512, 1024]]
Model Summary: 335 layers, 4.73933e+07 parameters, 4.73933e+07 gradients
Transferred 650/658 items from yolov5l.pt
Optimizer groups: 110 .bias, 118 conv.weight, 107 other
Scanning labels /content/drive/LSB/yolov5-3.1/data/LSB/labels/train2020.cache (890 found, 0 missing, 0 empty, 0 duplicate, for 890 images): 890it [00:00, 13522.22it/s]
Scanning labels /content/drive/LSB/yolov5-3.1/data/LSB/labels/val2020.cache (110 found, 0 missing, 0 empty, 0 duplicate, for 110 images): 110it [00:00, 8486.12it/s]
NumExpr defaulting to 2 threads.
Analyzing anchors... anchors/target = 4.47, Best Possible Recall (BPR) = 1.0000
Image sizes 416 train, 416 test
Using 2 dataloader workers
Logging results to runs/exp5
Starting training for 100 epochs...
Epoch gpu_mem box obj cls total targets img_size
0/99 4.33G 0.08804 0.02441 0 0.1125 18 416: 100% 56/56 [24:22<00:00, 26.12s/it]
Class Images Targets P R mAP@.5 mAP@.5:.95: 71% 5/7 [00:20<00:10, 5.15s/it]Corrupt JPEG data: 32608 extraneous bytes before marker 0xd9
Class Images Targets P R mAP@.5 mAP@.5:.95: 100% 7/7 [00:38<00:00, 5.53s/it]
all 110 0 0 0 0 0
Premature end of JPEG file
Epoch gpu_mem box obj cls total targets img_size
1/99 4.49G 0.08051 0.02256 0 0.1031 19 416: 100% 56/56 [26:07<00:00, 28.00s/it]
Class Images Targets P R mAP@.5 mAP@.5:.95: 43% 3/7 [00:01<00:01, 2.23it/s]Traceback (most recent call last):
File "train.py", line 460, in <module>
train(hyp, opt, device, tb_writer)
File "train.py", line 320, in train
plots=epoch == 0 or final_epoch) # plot first and last
File "/content/drive/LSB/yolov5-3.1/test.py", line 95, in test
for batch_i, (img, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)):
File "/usr/local/lib/python3.6/dist-packages/tqdm/std.py", line 1104, in __iter__
for obj in iterable:
File "/content/drive/LSB/yolov5-3.1/utils/datasets.py", line 91, in __iter__
yield next(self.iterator)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 435, in __next__
data = self._next_data()
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 1065, in _next_data
return self._process_data(data)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/content/drive/LSB/yolov5-3.1/utils/datasets.py", line 537, in __getitem__
img, (h0, w0), (h, w) = load_image(self, index)
File "/content/drive/LSB/yolov5-3.1/utils/datasets.py", line 616, in load_image
assert img is not None, 'Image Not Found ' + path
AssertionError: Image Not Found /content/drive/LSB/yolov5-3.1/data/LSB/images/val2020/976.jpg
Class Images Targets P R mAP@.5 mAP@.5:.95: 43% 3/7 [00:06<00:08, 2.14s/it]
This issue has been mentioned by others
I have another question: the images in my dataset are 2048*1489, and the model input size is 416*416. Do I need to make my label files according to 416*416? Also, my training speed is very slow. Is that because the images are too big? Do I need to resize the images to 416*416 myself? Thanks.
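On the label question: YOLOv5 label files store normalized coordinates (class, x_center, y_center, width, height, each divided by the image width or height), so they do not depend on the `--img` value and should not need to be remade for 416. A minimal sketch of that conversion, assuming the annotation starts as a pixel-space corner box; the helper name `to_yolo` and the example box are hypothetical:

```python
def to_yolo(box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to normalized (xc, yc, w, h)."""
    x_min, y_min, x_max, y_max = box
    xc = (x_min + x_max) / 2 / img_w  # box center, as a fraction of image width
    yc = (y_min + y_max) / 2 / img_h  # box center, as a fraction of image height
    w = (x_max - x_min) / img_w       # box width, as a fraction of image width
    h = (y_max - y_min) / img_h       # box height, as a fraction of image height
    return xc, yc, w, h


# Image size taken from the question above; the box itself is made up.
label = to_yolo((512, 300, 1024, 700), img_w=2048, img_h=1489)

# One label line per object: "<class> <xc> <yc> <w> <h>"
print("0 " + " ".join(f"{v:.6f}" for v in label))
```

Because every value is a fraction of the original image size, the same label line stays valid whatever size the image is resized to for training.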
@kongkk233 can you provide a reproducible example in a Colab notebook, so we can simply click run and see the error?
@glenn-jocher I have solved this problem. I use --cache while training. Thank you very much.
@kongkk233 that's odd. Do you think it may have to do with the #195 result where the image was a gif?
@glenn-jocher No. My image format is png. I think it's because every epoch needs to re-read the images from Google Drive, and sometimes a read goes wrong.
Oh, then you simply have network issues. You should always train with local data, never with remote buckets/drives.
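One way to follow this advice in Colab is to copy the dataset from the mounted Drive to the notebook's local disk once, before training, and point the data YAML at the copy. A minimal sketch using only the standard library; the helper name `copy_dataset_local` and the destination path are assumptions, and the source path is the one shown in the log above:

```python
import shutil
from pathlib import Path


def copy_dataset_local(src, dst):
    """Copy a dataset tree to local disk once and return the local path."""
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"source dataset not found: {src}")
    if dst.exists():
        # Assumption: an existing destination is a complete earlier copy.
        return dst
    shutil.copytree(src, dst)
    return dst


# Hypothetical usage in Colab (destination path is an assumption):
# copy_dataset_local("/content/drive/LSB/yolov5-3.1/data/LSB", "/content/LSB")
```

After copying, the `train` and `val` entries in the data YAML would need to be updated to the local directory so that each epoch reads from local disk rather than from Drive.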
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Requirements
Python 3.8 or later with all requirements.txt dependencies installed, including `torch>=1.7`. To install run:
$ pip install -r requirements.txt
Environments
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
- Google Colab Notebook with free GPU:
- Kaggle Notebook with free GPU: https://www.kaggle.com/ultralytics/yolov5
- Google Cloud Deep Learning VM. See GCP Quickstart Guide
- Docker Image https://hub.docker.com/r/ultralytics/yolov5. See Docker Quickstart Guide
Status
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.
You should name this file path using only English characters.
I also met the same problem. How did you solve it?
@kongkk233
@glenn-jocher I have solved this problem. I use --cache while training. Thank you very much.
Hello, I also encountered the same situation, may I ask how you solved it? Can you provide specific solutions?
You have to delete the `.cache` file in the labels folder, e.g. `data/LSB/labels/xxx.cache`.
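The suggestion above can be scripted so that stale caches are cleared before each fresh run. A minimal sketch, assuming the cache files sit directly inside the labels directory as in the log earlier in this thread (`train2020.cache`, `val2020.cache`); the helper name `clear_label_caches` is hypothetical:

```python
from pathlib import Path


def clear_label_caches(labels_dir):
    """Delete every *.cache file directly under labels_dir; return the removed paths."""
    removed = []
    for cache_file in sorted(Path(labels_dir).glob("*.cache")):
        cache_file.unlink()
        removed.append(cache_file)
    return removed


# Hypothetical usage, with the directory taken from the log above:
# clear_label_caches("/content/drive/LSB/yolov5-3.1/data/LSB/labels")
```

Only files ending in `.cache` are touched, so the label `.txt` files stay in place and the cache is rebuilt on the next training run.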
I also met the same problem. How did you solve it?
Ways to solve this issue:
@PawanKuma hi there,
To solve this issue, you can try the following approaches:
- Update the train, test, and val paths in your custom training YAML file (usually named yolov7cstm.yml) to ensure they are pointing to the correct directories where your images and labels are located.
- Clear the label cache before every fresh run. The label cache file is created inside the data folder, so you can delete the cache file to ensure that fresh labels are loaded during training.
These steps should help resolve the "Image Not Found" issue you are facing during training.
Let me know if you need any further assistance.
Thanks!
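The path check described in this reply can be done before training starts. A minimal sketch, assuming the split directories have already been read out of the data YAML into a plain dict; the helper name `check_split_dirs`, the suffix list, and the example paths are assumptions:

```python
from pathlib import Path

# Assumption: the image types used in the dataset; extend as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def check_split_dirs(splits):
    """Return {split: image_count}; raise if a directory is missing or holds no images."""
    counts = {}
    for name, directory in splits.items():
        d = Path(directory)
        if not d.is_dir():
            raise FileNotFoundError(f"{name}: directory not found: {d}")
        n = sum(
            1 for p in d.rglob("*")
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
        if n == 0:
            raise ValueError(f"{name}: no images under {d}")
        counts[name] = n
    return counts


# Hypothetical usage, with paths as they would appear in the data YAML:
# check_split_dirs({"train": "data/LSB/images/train2020",
#                   "val": "data/LSB/images/val2020"})
```

A wrong or stale path then fails immediately with a clear message instead of surfacing later as "Image Not Found" inside a DataLoader worker.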
@glenn-jocher I am currently trying to train a yolov6 model in google colab and I am getting the above error.
(Traceback (most recent call last):
File "/content/YOLOv6/tools/train.py", line 142, in
Is it because I imported the dataset from my Google Drive? And if that's the case, is what you said above referring to my situation? (glenn-jocher commented on Nov 27, 2020: "Oh, then you simply have network issues. You should always train with local data, never with remote buckets/drives.")
@Cho-Hong-Seok hi there! It seems like the issue you're encountering is due to the path to your dataset not being correctly identified by the code. This can occur for a variety of reasons, including incorrect file paths or permission issues accessing Google Drive.
While training on Google Colab, remember that your notebook needs permission to access files on Google Drive. Make sure you've mounted your Google Drive correctly and the path you've specified in your training script exactly matches the location of your dataset.
Here's a quick snippet to ensure Google Drive is mounted correctly:
```python
from google.colab import drive
drive.mount('/content/drive')
```
Verify your dataset path after mounting. For example:
```python
# Quote the path so the shell does not treat [DILab_data] as a glob pattern
!ls "/content/drive/MyDrive/[DILab_data]/Computer_Vision/Fire_detection/FST1/FST1/train/images"
```
This will list all files in the specified directory, confirming the path is accurate.
The comment you've referenced about network issues pertains to physical distance between data and compute resources, which doesn't seem directly related to your current problem.
Please check the dataset path and permission settings. Let me know if the issue persists! 😊
❔Question
I don't know why some images come back as None from load_image(). The dataset is normal. Every time I train the model, I run into this problem.
Additional context
File "train.py", line 460, in <module>
train(hyp, opt, device, tb_writer)
File "train.py", line 243, in train
for i, (imgs, targets, paths, _) in pbar: # batch -------------------------------------------------------------
File "/usr/local/lib/python3.6/dist-packages/tqdm/std.py", line 1104, in __iter__
for obj in iterable:
File "/content/drive/LSB/yolov5-3.1/drive/LSB/yolov5-3.1/drive/LSB/yolov5-3.1/utils/datasets.py", line 91, in __iter__
yield next(self.iterator)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 435, in __next__
data = self._next_data()
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 1065, in _next_data
return self._process_data(data)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 1.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/content/drive/LSB/yolov5-3.1/drive/LSB/yolov5-3.1/drive/LSB/yolov5-3.1/utils/datasets.py", line 525, in __getitem__
img, labels = load_mosaic(self, index)
File "/content/drive/LSB/yolov5-3.1/drive/LSB/yolov5-3.1/drive/LSB/yolov5-3.1/utils/datasets.py", line 655, in load_mosaic
img, _, (h, w) = load_image(self, index)
File "/content/drive/LSB/yolov5-3.1/drive/LSB/yolov5-3.1/drive/LSB/yolov5-3.1/utils/datasets.py", line 616, in load_image
assert img is not None, 'Image Not Found ' + path
AssertionError: Image Not Found /content/drive/LSB/yolov5-3.1/data/LSB/images/train2020/396.jpg
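When the assertion in `load_image` fires, it can help to scan the image folders outside of training to see whether specific files are unreadable or truncated. The sketch below is a rough standard-library heuristic, not what YOLOv5 does internally: it only checks that each file can be read and that its bytes start and end like a complete JPEG or PNG, without decoding the image. The helper name `find_suspect_images` is hypothetical:

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def find_suspect_images(image_dir):
    """Return (path, reason) pairs for files that cannot be read or look truncated."""
    suspects = []
    for path in sorted(Path(image_dir).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        try:
            data = path.read_bytes()
        except OSError as err:  # e.g. a failed read from a network-backed drive
            suspects.append((path, f"read error: {err}"))
            continue
        if not data:
            suspects.append((path, "empty file"))
        elif data[:2] == b"\xff\xd8":  # JPEG start-of-image marker
            # A complete JPEG normally ends with the end-of-image marker FF D9.
            if not data.rstrip(b"\x00").endswith(b"\xff\xd9"):
                suspects.append((path, "JPEG without end marker"))
        elif data[:8] == PNG_SIGNATURE:
            # A complete PNG ends with an IEND chunk (the last 12 bytes).
            if b"IEND" not in data[-16:]:
                suspects.append((path, "PNG without IEND chunk"))
        else:
            suspects.append((path, "unrecognized file signature"))
    return suspects


# Hypothetical usage, with the directory taken from the traceback above:
# for path, reason in find_suspect_images(
#         "/content/drive/LSB/yolov5-3.1/data/LSB/images/train2020"):
#     print(path, reason)
```

The file type is detected from the leading bytes rather than the extension, since files named `.jpg` may contain PNG data. If the scan reports different files on different runs, that would point toward intermittent read failures from the drive rather than damaged images.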