ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Training "Memory Error" on Windows #13390

Open suws0501 opened 3 weeks ago

suws0501 commented 3 weeks ago

Search before asking

YOLOv5 Component

No response

Bug

I tried to run your training script (train.py) on my Windows machine. It loaded a few things, moved some data around, and built a few caches, then it crashed.

`(yolo) C:\Users\baoth\OneDrive\Desktop\yolo\yolov5>python train.py --epochs 10 --img 640 --batch 16 --data ../data.yaml --weights yolov5s.pt
train: weights=yolov5s.pt, cfg=, data=../data.yaml, hyp=data\hyps\hyp.scratch-low.yaml, epochs=10, batch_size=16, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, evolve_population=data\hyps, resume_evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs\train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest, ndjson_console=False, ndjson_file=False
github: up to date with https://github.com/ultralytics/yolov5
YOLOv5 v7.0-378-g2f74455a Python-3.12.4 torch-2.5.0+cpu CPU

hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
Comet: run 'pip install comet_ml' to automatically track and visualize YOLOv5 runs in Comet
TensorBoard: Start with 'tensorboard --logdir runs\train', view at http://localhost:6006/
Overriding model.yaml nc=80 with nc=3

                 from  n    params  module                                  arguments
  0                -1  1      3520  models.common.Conv                      [3, 32, 6, 2, 2]
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]
  2                -1  1     18816  models.common.C3                        [64, 64, 1]
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]
  4                -1  2    115712  models.common.C3                        [128, 128, 2]
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]
  6                -1  3    625152  models.common.C3                        [256, 256, 3]
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]
  8                -1  1   1182720  models.common.C3                        [512, 512, 1]
  9                -1  1    656896  models.common.SPPF                      [512, 512, 5]
 10                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0  models.common.Concat                    [1]
 13                -1  1    361984  models.common.C3                        [512, 256, 1, False]
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         0  models.common.Concat                    [1]
 17                -1  1     90880  models.common.C3                        [256, 128, 1, False]
 18                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]
 19          [-1, 14]  1         0  models.common.Concat                    [1]
 20                -1  1    296448  models.common.C3                        [256, 256, 1, False]
 21                -1  1    590336  models.common.Conv                      [256, 256, 3, 2]
 22          [-1, 10]  1         0  models.common.Concat                    [1]
 23                -1  1   1182720  models.common.C3                        [512, 512, 1, False]
 24      [17, 20, 23]  1     21576  models.yolo.Detect                      [3, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model summary: 214 layers, 7027720 parameters, 7027720 gradients, 16.0 GFLOPs

Transferred 343/349 items from yolov5s.pt
optimizer: SGD(lr=0.01) with parameter groups 57 weight(decay=0.0), 60 weight(decay=0.0005), 60 bias
train: Scanning C:\Users\baoth\OneDrive\Desktop\yolo\train\labels.cache... 996 images, 0 backgrounds, 0 corrupt: 100%|██████████| 996/996 [00:00<?, ?it/s]
val: Scanning C:\Users\baoth\OneDrive\Desktop\yolo\valid\labels.cache... 61 images, 0 backgrounds, 0 corrupt: 100%|██████████| 61/61 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\baoth\miniconda3\Lib\multiprocessing\spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\baoth\miniconda3\Lib\multiprocessing\spawn.py", line 131, in _main
    prepare(preparation_data)
  File "C:\Users\baoth\miniconda3\Lib\multiprocessing\spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\baoth\miniconda3\Lib\multiprocessing\spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 286, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\baoth\OneDrive\Desktop\yolo\yolov5\train.py", line 47, in <module>
    import val as validate  # for end-of-epoch mAP
    ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\baoth\OneDrive\Desktop\yolo\yolov5\val.py", line 60, in <module>
    from utils.plots import output_to_target, plot_images, plot_val_study
  File "C:\Users\baoth\OneDrive\Desktop\yolo\yolov5\utils\plots.py", line 15, in <module>
    import seaborn as sn
  File "C:\Users\baoth\miniconda3\Lib\site-packages\seaborn\__init__.py", line 7, in <module>
    from .categorical import *  # noqa: F401,F403
    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\baoth\miniconda3\Lib\site-packages\seaborn\categorical.py", line 19, in <module>
    from seaborn._stats.density import KDE
  File "C:\Users\baoth\miniconda3\Lib\site-packages\seaborn\_stats\density.py", line 10, in <module>
    from scipy.stats import gaussian_kde
  File "C:\Users\baoth\miniconda3\Lib\site-packages\scipy\stats\__init__.py", line 610, in <module>
    from ._stats_py import *
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 991, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1087, in get_code
  File "<frozen importlib._bootstrap_external>", line 1187, in get_data
MemoryError `

Environment

yolov5s, Windows, no CUDA (CPU only)

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

UltralyticsAssistant commented 3 weeks ago

👋 Hello @suws0501, thank you for reaching out about your issue with YOLOv5 🚀! It looks like you're encountering a "Memory Error" while running training on your Windows machine.

As this seems to be a 🐛 Bug Report, could you please provide a minimum reproducible example to help us understand the problem better? This should include the exact command you're running along with any modifications you've made to the code or configuration.

In the meantime, please ensure your system meets the following requirements:

For execution environments, YOLOv5 can typically be run on multiple verified setups such as cloud platforms or with Docker to alleviate local resource constraints.

This is an automated response, but an Ultralytics engineer will look into your situation soon. In the meantime, feel free to add any additional information that might help us assist you. 😊

pderrenger commented 1 week ago

@suws0501 it seems like you're encountering a memory error during training on Windows. Please ensure you're using the latest version of YOLOv5 and try reducing the batch size to see if it resolves the issue. If the problem persists, consider using a machine with more RAM or leveraging a cloud-based solution with GPU support.
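
A minimal sketch of the batch-size suggestion, assuming the same `train.py` flags used in the original command. The traceback passes through `multiprocessing\spawn.py`, which suggests the error was raised while a dataloader worker process was re-importing `train.py`. Lowering `--workers` alongside `--batch` may therefore also be worth trying, though the thread does not confirm this. The helper name `build_train_command` and the default values (batch 8, workers 0) are hypothetical and chosen only for illustration.

```python
import sys


def build_train_command(batch_size=8, workers=0, epochs=10, img_size=640,
                        data="../data.yaml", weights="yolov5s.pt"):
    """Build a train.py command line with a smaller batch and fewer workers.

    Hypothetical helper for illustration only. The defaults are assumptions,
    not values recommended anywhere in the thread.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if workers < 0:
        raise ValueError("workers must be 0 or greater")
    return [
        sys.executable, "train.py",
        "--epochs", str(epochs),
        "--img", str(img_size),
        "--batch", str(batch_size),   # reduced from 16 in the original run
        "--data", data,
        "--weights", weights,
        "--workers", str(workers),    # the original run used workers=8
    ]
```

To launch it from the `yolov5` directory, pass the returned list to `subprocess.run(cmd, check=True)`. If training starts with these settings, the values can be raised step by step until memory becomes the limit again.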