Closed abdallah1989203 closed 1 year ago
👋 Hello @abdallah1989203, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.
Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:
git clone https://github.com/ultralytics/yolov5 # clone
cd yolov5
pip install -r requirements.txt # install
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!
Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.
Check out our YOLOv8 Docs for details and get started with:
pip install ultralytics
@abdallah1989203 based on the error message you provided, the DataLoader raises an IndexError while training with YOLOv5. The message indicates that, in the random_perspective augmentation, a boolean index has a different length than the array it indexes.
One possible solution is to update your YOLOv5 repository to the latest version with git pull, or to clone the repository again with git clone https://github.com/ultralytics/yolov5. There have been 15 commits since your version, and updating to the latest version might resolve the issue.
If updating the repository doesn't solve the problem, you can try modifying the random_perspective function in augmentations.py. Specifically, check the line where new_segments is defined and make sure the boolean index has the same length as the array it indexes.
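As a rough illustration of that check, here is a minimal NumPy sketch, not the actual YOLOv5 code. It reproduces the same kind of IndexError with an empty segment list and a four-element mask, then shows a length guard. The helper name filter_segments is hypothetical, and the sketch assumes every segment has the same number of points.

```python
import numpy as np


def filter_segments(new_segments, keep):
    """Hypothetical helper: apply the boolean mask only when lengths agree."""
    keep = np.asarray(keep, dtype=bool)
    # Assumes all segments have the same shape, so this stacks to (N, n, 2).
    segs = np.array(new_segments)
    if segs.shape[0] != keep.shape[0]:
        # Lengths disagree (for example, no segments at all): return nothing
        # instead of raising IndexError.
        return segs[:0]
    return segs[keep]


keep = np.array([True, False, True, True])  # one flag per label

# Indexing an empty array with a length-4 mask raises IndexError,
# which is the situation reported in the traceback.
try:
    np.array([])[keep]
    raised = False
except IndexError:
    raised = True

# With matching lengths the mask is applied as usual.
polygons = [np.zeros((3, 2)) for _ in range(4)]
kept = filter_segments(polygons, keep)

# With no segments the guard returns an empty array.
empty = filter_segments([], keep)
```

A guard like this only hides the mismatch. If the segment list is empty while labels exist, the labels themselves are worth inspecting.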
Please let us know if updating the repository or modifying the code resolves the issue.
Thanks @glenn-jocher.
@abdallah1989203 thanks for reporting this issue!
The error message suggests that, during the random_perspective augmentation in the DataLoader, a boolean index does not match the length of the array it indexes. It may be caused by a bug in the code.
To resolve this issue, you can try updating your YOLOv5 repository to the latest version with git pull, or clone the repository again. There have been several commits since your version, and updating may fix the problem.
If updating doesn't solve the issue, you can modify the random_perspective function in augmentations.py. Check the line where new_segments is defined and make sure the boolean index has the same length as the array it indexes.
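Before editing the augmentation code, it may be worth ruling out the dataset. One plausible cause, not confirmed in this thread, is that some label files contain box rows (class plus 4 values) rather than polygon rows, which would leave the segment list empty. The sketch below assumes the usual YOLO segmentation label layout of one object per line, "class x1 y1 x2 y2 ...", and flags rows too short to describe a polygon. The function name find_box_only_rows is hypothetical.

```python
from pathlib import Path


def find_box_only_rows(label_dir):
    """Return (file name, line number) for label rows too short to be polygons."""
    suspicious = []
    for path in sorted(Path(label_dir).glob("*.txt")):
        lines = path.read_text().splitlines()
        for lineno, line in enumerate(lines, start=1):
            values = line.split()
            # A polygon row needs a class id plus at least 3 (x, y) points,
            # i.e. 7 values. A box row has only 5 values.
            if values and len(values) < 7:
                suspicious.append((path.name, lineno))
    return suspicious
```

Calling find_box_only_rows on the training labels directory returns an empty list when every non-empty row looks like a polygon.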
Please let us know if updating the repository or modifying the code fixes the problem.
Thanks again for bringing this to our attention, and we appreciate your contribution to YOLOv5!
Search before asking
YOLOv5 Component
Training, Multi-GPU
Bug
Epoch GPU_mem box_loss seg_loss obj_loss cls_loss Instances Size
0%| | 0/257 [00:00<?, ?it/s]
Traceback (most recent call last):
File "segment/train.py", line 667, in <module>
0%| | 0/257 [00:00<?, ?it/s]
Traceback (most recent call last):
File "segment/train.py", line 667, in <module>
main(opt)
File "segment/train.py", line 558, in main
train(opt.hyp, opt, device, callbacks)
File "segment/train.py", line 287, in train
main(opt)
File "segment/train.py", line 558, in main
for i, (imgs, targets, paths, _, masks) in pbar: # batch ------------------------------------------------------
File "/data/Yolo/yolov5/utils/dataloaders.py", line 172, in __iter__
yield next(self.iterator)
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
train(opt.hyp, opt, device, callbacks)
File "segment/train.py", line 287, in train
for i, (imgs, targets, paths, _, masks) in pbar: # batch ------------------------------------------------------
File "/home/mm/.local/lib/python3.7/site-packages/tqdm/std.py", line 1178, in __iter__
data = self._next_data()
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
for obj in iterable:
File "/data/Yolo/yolov5/utils/dataloaders.py", line 172, in __iter__
return self._process_data(data)
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
yield next(self.iterator)
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
data = self._next_data()
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
data.reraise()
File "/home/mm/.local/lib/python3.7/site-packages/torch/_utils.py", line 543, in reraise
raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data/Yolo/yolov5/utils/segment/dataloaders.py", line 115, in __getitem__
img, labels, segments = self.load_mosaic(index)
File "/data/Yolo/yolov5/utils/segment/dataloaders.py", line 263, in load_mosaic
border=self.mosaic_border) # border to remove
File "/data/Yolo/yolov5/utils/segment/augmentations.py", line 102, in random_perspective
new_segments = np.array(new_segments)[i]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 0 but corresponding boolean dimension is 4
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
data.reraise()
File "/home/mm/.local/lib/python3.7/site-packages/torch/_utils.py", line 543, in reraise
raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/mm/.local/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data/Yolo/yolov5/utils/segment/dataloaders.py", line 115, in __getitem__
img, labels, segments = self.load_mosaic(index)
File "/data/Yolo/yolov5/utils/segment/dataloaders.py", line 263, in load_mosaic
border=self.mosaic_border) # border to remove
File "/data/Yolo/yolov5/utils/segment/augmentations.py", line 102, in random_perspective
new_segments = np.array(new_segments)[i]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 0 but corresponding boolean dimension is 4
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 57659) of binary: /opt/conda/bin/python
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/mm/.local/lib/python3.7/site-packages/torch/distributed/run.py", line 766, in <module>
main()
File "/home/mm/.local/lib/python3.7/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/mm/.local/lib/python3.7/site-packages/torch/distributed/run.py", line 762, in main
run(args)
File "/home/mm/.local/lib/python3.7/site-packages/torch/distributed/run.py", line 756, in run
)(cmd_args)
File "/home/mm/.local/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/mm/.local/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 248, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
segment/train.py FAILED
Failures:
[1]:
time : 2023-06-29_07:35:36
host : 5bc75d84eb31
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 57660)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Root Cause (first observed failure):
[0]:
time : 2023-06-29_07:35:36
host : 5bc75d84eb31
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 57659)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Environment
github: ⚠️ YOLOv5 is out of date by 15 commits. Use 'git pull' or 'git clone https://github.com/ultralytics/yolov5' to update.
YOLOv5 🚀 v7.0-172-gc3c1304 Python-3.7.7 torch-1.13.1+cu117 CUDA:0 (NVIDIA GeForce GTX TITAN X, 12213MiB)
CUDA:1 (NVIDIA GeForce GTX TITAN X, 12213MiB)
- OS: Ubuntu
- Python: 3.7.7
Minimal Reproducible Example
python -m torch.distributed.run --nproc_per_node 2 segment/train.py --device 0,1
Additional
Can anyone please help?
Are you willing to submit a PR?