meituan / YOLOv6

YOLOv6: a single-stage object detection framework dedicated to industrial applications.
GNU General Public License v3.0

CUDA error: invalid device ordinal #549

Closed asafberreby closed 2 years ago

asafberreby commented 2 years ago

Question

I am trying to run training on a custom dataset and keep getting this exception, even when I run it on a computer with a single GPU. Any thoughts?

BTW: when I train on the CPU, it works perfectly fine.

Chilicyy commented 2 years ago

Hi, you can check if PyTorch sees your devices correctly and that CUDA works. Try running this in the Python interpreter and seeing what it shows:

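import torch

print(torch.__version__)          # PyTorch version
print(torch.version.cuda)         # CUDA version PyTorch was built with
print(torch.cuda.is_available())  # Check that CUDA works
print(torch.cuda.device_count())  # Check how many CUDA-capable devices PyTorch can see

# Print human-readable device names. Only query indices below device_count();
# asking for a device that does not exist raises an error.
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))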
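
The point of the check above is that every GPU index you use has to be smaller than the number returned by torch.cuda.device_count(). As a rough sketch, the comparison could look like the snippet below. The helper name invalid_device_ids and the comma-separated string format are assumptions made for illustration; they are not part of PyTorch or YOLOv6.

```python
def invalid_device_ids(requested, device_count):
    """Return the requested GPU indices that fall outside 0..device_count-1.

    `requested` is a comma-separated string such as "0" or "0,1" (assumed format).
    `device_count` is the number of visible GPUs, e.g. torch.cuda.device_count().
    """
    ids = [int(part) for part in requested.split(",") if part.strip()]
    return [i for i in ids if i < 0 or i >= device_count]


# On a machine with a single GPU, only index 0 exists.
print(invalid_device_ids("0", 1))    # []
print(invalid_device_ids("0,1", 1))  # [1]
```

If the returned list is not empty, those are the indices that would most likely trigger the invalid device ordinal error.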

If the devices exist and CUDA works, then the problem is probably the device ID you are using: it has to be lower than the device count reported above. You can also set CUDA_VISIBLE_DEVICES before the command so that PyTorch can only see the specified device:

# Only make GPU ID 0 visible to PyTorch
CUDA_VISIBLE_DEVICES=0 python tools/train.py
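
Keep in mind that the devices listed in CUDA_VISIBLE_DEVICES are renumbered from 0 inside the process, so the IDs you use in PyTorch refer to positions in that list, not to the physical GPU numbers. A small sketch of that mapping follows. The function name logical_to_physical is made up for illustration, and it assumes the variable holds plain integer indices (the variable can also hold GPU UUIDs, which this sketch does not handle).

```python
def logical_to_physical(visible_devices):
    """Map the device index seen inside the process to the physical GPU index.

    `visible_devices` is the value of CUDA_VISIBLE_DEVICES, assumed here to be
    a comma-separated list of integer indices such as "0" or "2,3".
    """
    ids = [int(part) for part in visible_devices.split(",") if part.strip()]
    return dict(enumerate(ids))


# With CUDA_VISIBLE_DEVICES=2,3 the process sees two devices, numbered 0 and 1.
print(logical_to_physical("2,3"))  # {0: 2, 1: 3}
print(logical_to_physical("0"))    # {0: 0}
```

So with CUDA_VISIBLE_DEVICES=0, index 0 is the only valid device ID in the training process, and requesting any other index will fail.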