ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

pytorch's nvidia gpu device is not recognized. #12849

Closed rurusungoa closed 4 months ago

rurusungoa commented 6 months ago

Hello @glenn-jocher I'm sorry for keeping asking questions. I sincerely ask for your answer.

  1. Device check

import torch
print(torch.cuda.get_device_name(0))

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[5], line 1
----> 1 torch.cuda.get_device_name(0)

File /opt/conda/envs/python310/lib/python3.10/site-packages/torch/cuda/__init__.py:419, in get_device_name(device)
    407 def get_device_name(device: Optional[_device_t] = None) -> str:
    408     r"""Gets the name of a device.
    409
    410     Args:
    (...)
    417         str: the name of the device
    418     """
--> 419     return get_device_properties(device).name

File /opt/conda/envs/python310/lib/python3.10/site-packages/torch/cuda/__init__.py:449, in get_device_properties(device)
    439 def get_device_properties(device: _device_t) -> _CudaDeviceProperties:
    440     r"""Gets the properties of a device.
    441
    442     Args:
    (...)
    447         _CudaDeviceProperties: the properties of the device
    448     """
--> 449     _lazy_init()  # will define _get_device_properties
    450     device = _get_device_index(device, optional=True)
    451     if device < 0 or device >= device_count():

File /opt/conda/envs/python310/lib/python3.10/site-packages/torch/cuda/__init__.py:289, in _lazy_init()
    284     raise RuntimeError(
    285         "Cannot re-initialize CUDA in forked subprocess. To use CUDA with "
    286         "multiprocessing, you must use the 'spawn' start method"
    287     )
    288 if not hasattr(torch._C, "_cuda_getDeviceCount"):
--> 289     raise AssertionError("Torch not compiled with CUDA enabled")
    290 if _cudart is None:
    291     raise AssertionError(
    292         "libcudart functions unavailable. It looks like you have a broken build?"
    293     )

AssertionError: Torch not compiled with CUDA enabled

  2. Check whether CUDA is available

print(torch.cuda.is_available())
False

  3. List of installed packages

The Python version is 3.10.14. Since I couldn't install it with cudatoolkit, I tried pytorch-cuda as shown below, but the same error occurs as with cudatoolkit.

conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=11.8 -c pytorch -c nvidia

conda list | grep python

python 3.10.14 hd12c33a_0_cpython conda-forge

conda list | grep pytorch

libopenvino-pytorch-frontend  2023.3.0  h59595ed_3                 conda-forge
pytorch                       2.1.2     cpu_mkl_py310h3ea73d3_100  conda-forge
pytorch-cuda                  12.1      ha16c6d3_5                 pytorch
pytorch-mutex                 1.0       cuda                       pytorch
torchaudio                    2.1.2     py310_cu121                pytorch

conda list | grep torch

libopenvino-pytorch-frontend  2023.3.0  h59595ed_3                 conda-forge
libtorch                      2.1.2     cpu_mkl_hadc400e_100       conda-forge
pytorch                       2.1.2     cpu_mkl_py310h3ea73d3_100  conda-forge
pytorch-cuda                  12.1      ha16c6d3_5                 pytorch
pytorch-mutex                 1.0       cuda                       pytorch
torchaudio                    2.1.2     py310_cu121                pytorch
torchdata                     0.7.1     py310h0bd2ee8_3            conda-forge
torchtext                     0.15.2    py310h4e894d6_4            conda-forge
torchvision                   0.16.1    cpu_py310h684a773_3        conda-forge

conda list | grep cuda

cuda-cudart     12.1.105  0           nvidia
cuda-cupti      12.1.105  0           nvidia
cuda-libraries  12.1.0    0           nvidia
cuda-nvrtc      12.1.105  0           nvidia
cuda-nvtx       12.1.105  0           nvidia
cuda-opencl     12.4.99   h59595ed_0  conda-forge
cuda-runtime    12.1.0    0           nvidia
cuda-version    12.4      h3060b56_3  conda-forge
pytorch-cuda    12.1      ha16c6d3_5  pytorch
pytorch-mutex   1.0       cuda        pytorch

I installed the packages as above, but could you take a look at the error message? I don't know why it doesn't work. Please help (T.T)

Originally posted by @rurusungoa in https://github.com/ultralytics/yolov5/issues/12812#issuecomment-2017592744

glenn-jocher commented 6 months ago

Hello @rurusungoa 👋,

No worries about the questions, that’s what we’re here for!

From your description, it appears that PyTorch is installed without CUDA support, as indicated by the "Torch not compiled with CUDA enabled" error. The key detail here is the pytorch package version you have installed: pytorch 2.1.2 cpu_mkl_py310h3ea73d3_100 conda-forge. This package is built for CPU only (cpu in the package name is the giveaway).
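
As a quick sanity check (just a minimal sketch, nothing YOLOv5-specific), you can confirm from Python whether the installed build ships CUDA support:

    import torch

    print(torch.__version__)                     # e.g. 2.1.2 (pip CPU wheels also show a '+cpu' suffix)
    print(torch.version.cuda)                    # None on a CPU-only build, e.g. '12.1' on a CUDA build
    print(torch.backends.cudnn.is_available())   # False on a CPU-only build
    print(torch.cuda.is_available())             # should become True once a CUDA build is installed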

To resolve this, you need to install a version of PyTorch that supports CUDA. Here's a simplified step you can follow:

  1. Uninstall the current PyTorch and related packages:

    conda uninstall pytorch torchvision torchaudio pytorch-cuda

  2. Install PyTorch with CUDA support. Make sure to specify the CUDA version supported by your GPU driver (check nvidia-smi). As of your message, for CUDA 12.1 support, you could use:

    conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

Note: Replace pytorch-cuda=12.1 with the appropriate version for your setup if different.

Afterward, validate with torch.cuda.is_available() to ensure CUDA is now recognized.
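
For example, a minimal post-install check (assuming a single visible GPU at index 0) could look like:

    import torch

    assert torch.cuda.is_available(), "CUDA still not visible - check the driver and the installed build"
    print(torch.cuda.get_device_name(0))   # e.g. 'NVIDIA GeForce RTX 3090' (your GPU will differ)
    x = torch.randn(2, 3, device="cuda")   # allocate a small tensor directly on the GPU
    print(x.device)                        # cuda:0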

For detailed installation instructions or further queries, our documentation is a great resource: https://docs.ultralytics.com/yolov5/

Hope this helps! 🚀

rurusungoa commented 6 months ago

Hello @glenn-jocher

Regarding your earlier answer, "Make sure your Conda environment is activated. Use the command you provided, which is generally correct, but double-check the PyTorch, CUDA, and NVIDIA channels for the latest versions and compatibility":

When I checked the PyTorch homepage (https://pytorch.org/get-started/previous-versions/), the following command is listed for v2.1.2 with conda, so that is what I used for the installation:

conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=11.8 -c pytorch -c nvidia

But the PyTorch package that actually gets installed is the CPU build, as shown below:

pytorch 2.1.2 cpu_mkl_py310h3ea73d3_100 conda-forge

I installed the packages on a regular PC first, and then checked them by accessing a server with a GPU. Do I need to run conda install on the GPU server from the beginning?

glenn-jocher commented 6 months ago

@rurusungoa hello 👋,

Yes, to utilize GPU capabilities on a server, you should ensure that the PyTorch version installed is compatible with CUDA. The package installed (pytorch 2.1.2 cpu_mkl_py310h3ea73d3_100 conda-forge) is optimized for CPU only. For GPU support, you'd need to install a version of PyTorch built with CUDA support.

Here's a simplified step:

  1. Activate your Conda environment.

  2. Uninstall the CPU-only PyTorch version:

    conda uninstall pytorch torchvision torchaudio

  3. Install PyTorch with CUDA support by specifying pytorch-cuda with the version that matches your server's CUDA. Assuming you need CUDA 11.8, you could use:

    conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

Make sure to adjust the pytorch-cuda version based on the GPU and CUDA version available on your server.

This setup ensures that PyTorch utilizes the GPU, boosting performance for your tasks.
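
Once that's in place, YOLOv5 will pick the GPU up automatically, but you can also be explicit about the device. Here's a minimal sketch (assuming internet access for torch.hub and a single GPU at index 0):

    import torch

    # Load a pretrained YOLOv5s model from Torch Hub and move it to the GPU
    model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
    model.to("cuda:0")

    # Run inference on a sample image; the forward pass now runs on the GPU
    results = model("https://ultralytics.com/images/zidane.jpg")
    results.print()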

Let us know if you need any more help! 🚀

github-actions[bot] commented 5 months ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see our documentation at https://docs.ultralytics.com/yolov5/.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐