ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

gpu memory usage is low but out of memory #13216

Open leooobreak opened 1 month ago

leooobreak commented 1 month ago


Question

When training YOLOv5, I found that the GPU was barely used: GPU memory usage is very low, while system memory increased very quickly, eventually resulting in out of memory. I would like to know how to make training actually use the GPU.

The command is: python .\train.py --device 0 --epochs 1 --batch-size 16
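A quick way to confirm that PyTorch can see the CUDA device at all, before launching training (if this prints False, PyTorch will silently fall back to the CPU regardless of --device):

```python
import torch

# Sanity check: can PyTorch see a CUDA-capable GPU?
print(torch.cuda.is_available())       # True if CUDA is usable
print(torch.cuda.device_count())       # number of visible GPUs
if torch.cuda.is_available():
    # Name of the device selected by --device 0
    print(torch.cuda.get_device_name(0))
```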


Additional

No response

github-actions[bot] commented 1 month ago

👋 Hello @leooobreak, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.8.0 with all requirements.txt installed including PyTorch>=1.8. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics
glenn-jocher commented 1 month ago

@leooobreak hi there!

Thank you for reaching out and providing detailed information about your issue. It sounds like you're encountering an unusual problem where the GPU memory usage is low, but you're still running out of memory. Here are a few steps you can take to troubleshoot and potentially resolve this issue:

  1. Verify Environment and Dependencies: Ensure that you are using the latest versions of YOLOv5 and its dependencies. You can update your repository and install the latest requirements with the following commands:

    git pull
    pip install -r requirements.txt
  2. Check GPU Utilization: A common cause of low GPU utilization is a CPU-side bottleneck, e.g. the data loading pipeline not keeping the GPU fed. You can monitor GPU usage with nvidia-smi to see whether the GPU is actually busy during training.

  3. Adjust Batch Size: The batch size you are using might be too large for your GPU memory. Try reducing the batch size to see if it helps:

    python train.py --device 0 --epochs 1 --batch-size 8
  4. Mixed Precision Training: YOLOv5 enables automatic mixed precision (AMP) by default during training, which already reduces GPU memory usage. Note that train.py does not accept a --half flag; --half is an inference option for val.py and detect.py, e.g.:

    python val.py --weights yolov5s.pt --half
  5. Distributed Data Parallel (DDP) Training: If you have multiple GPUs, consider using DistributedDataParallel (DDP) mode for better memory management and performance:

    python -m torch.distributed.run --nproc_per_node 2 train.py --batch 32 --device 0,1
  6. Check for Memory Leaks: Ensure that there are no memory leaks in your code or data pipeline. Sometimes, custom data loaders or transformations can cause memory issues.

If the issue persists, please provide more details about your setup, such as the GPU model, CUDA version, and PyTorch version. This information can help in diagnosing the problem more accurately.
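A short snippet that collects exactly these details for a bug report (a CPU-only PyTorch build, for example, shows up here as CUDA: None):

```python
import platform
import torch

# Environment details typically needed to diagnose GPU/memory issues.
print(f"Python:  {platform.python_version()}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA:    {torch.version.cuda}")  # None for a CPU-only build
if torch.cuda.is_available():
    print(f"GPU:     {torch.cuda.get_device_name(0)}")
```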

Feel free to reach out if you have any further questions or need additional assistance. The YOLO community and the Ultralytics team are here to help! 🚀