ultralytics / ultralytics

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0
28.46k stars 5.65k forks

NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. #1774

Closed weertman closed 1 year ago

weertman commented 1 year ago

Search before asking

YOLOv8 Component

Detection

Bug

runcell(5, 'D:/StarSeg-Lab/src/Use_Yolov8_On_Video.py')
Ultralytics YOLOv8.0.32  Python-3.9.13 torch-1.13.1+cu117 CUDA:1 (NVIDIA RTX A6000, 49140MiB)
YOLOv8x-seg summary (fused): 295 layers, 71721619 parameters, 0 gradients, 343.7 GFLOPs

Traceback (most recent call last):

  File "D:\StarSeg-Lab\src\Use_Yolov8_On_Video.py", line 41, in <module>
    result = model.predict(img, device = '1')

  File "C:\Users\wlwee\anaconda3\lib\site-packages\ultralytics\yolo\engine\model.py", line 147, in predict
    return self.predictor.predict_cli(source=source) if is_cli else self.predictor(source=source, stream=stream)

  File "C:\Users\wlwee\anaconda3\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)

  File "C:\Users\wlwee\anaconda3\lib\site-packages\ultralytics\yolo\engine\predictor.py", line 114, in __call__
    return list(self.stream_inference(source, model))  # merge list of Result into one

  File "C:\Users\wlwee\anaconda3\lib\site-packages\ultralytics\yolo\engine\predictor.py", line 169, in stream_inference
    self.results = self.postprocess(preds, im, im0s, self.classes)

  File "C:\Users\wlwee\anaconda3\lib\site-packages\ultralytics\yolo\v8\segment\predict.py", line 17, in postprocess
    p = ops.non_max_suppression(preds[0],

  File "C:\Users\wlwee\anaconda3\lib\site-packages\ultralytics\yolo\utils\ops.py", line 240, in non_max_suppression
    i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS

  File "C:\Users\wlwee\anaconda3\lib\site-packages\torchvision\ops\boxes.py", line 41, in nms
    return torch.ops.torchvision.nms(boxes, scores, iou_threshold)

  File "C:\Users\wlwee\anaconda3\lib\site-packages\torch\_ops.py", line 442, in __call__
    return self._op(*args, **kwargs or {})

NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torchvision::nms' is only available for these backends: [CPU, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].

CPU: registered at C:\tools\MINICO~1\CONDA-~2\TORCHV~1\work\torchvision\csrc\ops\cpu\nms_kernel.cpp:112 [kernel]
QuantizedCPU: registered at C:\tools\MINICO~1\CONDA-~2\TORCHV~1\work\torchvision\csrc\ops\quantized\cpu\qnms_kernel.cpp:124 [kernel]
BackendSelect: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:140 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:488 [backend fallback]
Functionalize: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\FunctionalizeFallbackKernel.cpp:291 [backend fallback]
Named: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\ConjugateFallback.cpp:18 [backend fallback]
Negative: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:64 [backend fallback]
AutogradOther: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:35 [backend fallback]
AutogradCPU: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:39 [backend fallback]
AutogradCUDA: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:47 [backend fallback]
AutogradXLA: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:51 [backend fallback]
AutogradMPS: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:59 [backend fallback]
AutogradXPU: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:43 [backend fallback]
AutogradHPU: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:68 [backend fallback]
AutogradLazy: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\VariableFallbackKernel.cpp:55 [backend fallback]
Tracer: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\autograd\TraceTypeManual.cpp:296 [backend fallback]
AutocastCPU: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\autocast_mode.cpp:482 [backend fallback]
AutocastCUDA: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\autocast_mode.cpp:324 [backend fallback]
FuncTorchBatched: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\LegacyBatchingRegistrations.cpp:743 [backend fallback]
FuncTorchVmapMode: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\BatchingRegistrations.cpp:1064 [backend fallback]
VmapMode: fallthrough registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\TensorWrapper.cpp:189 [backend fallback]
PythonTLSSnapshot: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:148 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\functorch\DynamicLayer.cpp:484 [backend fallback]
PythonDispatcher: registered at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\core\PythonFallbackKernel.cpp:144 [backend fallback]

Environment

Ultralytics YOLOv8.0.32  Python-3.9.13 torch-1.13.1+cu117 CUDA:0 (NVIDIA RTX A6000, 49140MiB)
Setup complete  (64 CPUs, 1023.6 GB RAM, 626.7/7440.2 GB disk)
Python version: 3.9.13
Operating system: Windows
Operating system release: 10
Processor architecture: AMD64
import torch
import torchvision
print(torch.__version__)
print(torchvision.__version__)
1.13.1+cu117
0.14.1

torch.cuda.is_available()
Out[15]: True
Package                  Version
------------------------ --------------------
aiofiles                 22.1.0
aiosqlite                0.18.0
anyio                    3.6.2
argon2-cffi              21.3.0
argon2-cffi-bindings     21.2.0
arrow                    1.2.3
asttokens                2.2.1
attrs                    22.2.0
Babel                    2.11.0
backcall                 0.2.0
beautifulsoup4           4.11.2
bleach                   6.0.0
blosc2                   2.0.0
certifi                  2022.12.7
cffi                     1.15.1
chardet                  4.0.0
charset-normalizer       3.0.1
colorama                 0.4.6
comm                     0.1.2
contourpy                1.0.7
cycler                   0.10.0
Cython                   0.29.33
debugpy                  1.6.6
decorator                5.1.1
defusedxml               0.7.1
entrypoints              0.4
executing                1.2.0
fastjsonschema           2.16.2
filelock                 3.10.7
fonttools                4.39.0
fqdn                     1.5.1
idna                     2.10
imageio                  2.26.0
ipykernel                6.21.1
ipython                  8.9.0
ipython-genutils         0.2.0
isoduration              20.11.0
jedi                     0.18.2
Jinja2                   3.1.2
joblib                   1.2.0
json5                    0.9.11
jsonpointer              2.3
jsonschema               4.17.3
jupyter_client           7.4.1
jupyter_core             5.2.0
jupyter-events           0.5.0
jupyter-server           1.23.5
jupyter_server_fileid    0.6.0
jupyter_server_terminals 0.4.4
jupyter_server_ydoc      0.6.1
jupyter-ydoc             0.2.2
jupyterlab               3.6.1
jupyterlab-pygments      0.2.2
jupyterlab_server        2.19.0
kiwisolver               1.4.4
lazy_loader              0.1
MarkupSafe               2.1.2
matplotlib               3.7.1
matplotlib-inline        0.1.6
mistune                  2.0.5
mpmath                   1.3.0
msgpack                  1.0.5
nbclassic                0.5.1
nbclient                 0.7.2
nbconvert                7.2.9
nbformat                 5.7.3
nest-asyncio             1.5.6
networkx                 3.0
notebook                 6.5.2
notebook_shim            0.2.2
numexpr                  2.8.4
numpy                    1.24.2
opencv-contrib-python    4.7.0.68
opencv-python            4.7.0.72
packaging                23.0
pandas                   1.5.3
pandocfilters            1.5.0
parso                    0.8.3
pickleshare              0.7.5
Pillow                   9.4.0
pip                      23.0.1
platformdirs             3.0.0
prometheus-client        0.16.0
prompt-toolkit           3.0.36
psutil                   5.9.4
pure-eval                0.2.2
py-cpuinfo               9.0.0
pycparser                2.21
Pygments                 2.14.0
pyparsing                2.4.7
pyrsistent               0.19.3
python-dateutil          2.8.2
python-dotenv            1.0.0
python-json-logger       2.0.4
pytz                     2022.7.1
PyWavelets               1.4.1
pywin32                  305
pywinpty                 2.0.10
PyYAML                   6.0
pyzmq                    25.0.0
requests                 2.28.2
requests-toolbelt        0.10.1
rfc3339-validator        0.1.4
rfc3986-validator        0.1.1
roboflow                 1.0.2
scikit-image             0.20.0
scikit-learn             1.2.1
scipy                    1.10.0
seaborn                  0.12.2
Send2Trash               1.8.0
sentry-sdk               1.18.0
setuptools               65.5.0
six                      1.16.0
sniffio                  1.3.0
soupsieve                2.3.2.post1
stack-data               0.6.2
sympy                    1.11.1
tables                   3.8.0
terminado                0.17.1
thop                     0.1.1.post2209072238
threadpoolctl            3.1.0
tifffile                 2023.2.28
tinycss2                 1.2.1
torch                    2.0.0
torchvision              0.15.1
tornado                  6.2
tqdm                     4.65.0
traitlets                5.9.0
typing_extensions        4.5.0
ultralytics              8.0.59
uri-template             1.2.0
urllib3                  1.26.14
voila                    0.4.0
wcwidth                  0.2.6
webcolors                1.12
webencodings             0.5.1
websocket-client         1.5.1
websockets               10.4
wget                     3.2
y-py                     0.5.5
ypy-websocket            0.8.2
(base) C:\Users\wlwee>nvidia-smi
Sun Apr  2 12:00:34 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 528.49       Driver Version: 528.49       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A6000   WDDM  | 00000000:18:00.0  On |                  Off |
| 31%   60C    P0    65W / 300W |   1891MiB / 49140MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA RTX A6000   WDDM  | 00000000:51:00.0 Off |                  Off |
| 30%   28C    P8     8W / 300W |      0MiB / 49140MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3944    C+G   ...e\PhoneExperienceHost.exe    N/A      |
|    0   N/A  N/A      4224    C+G   ...in7x64\steamwebhelper.exe    N/A      |
|    0   N/A  N/A      7756    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A      9396    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
|    0   N/A  N/A     10424    C+G   ...me\Application\chrome.exe    N/A      |
|    0   N/A  N/A     10472    C+G   ...cw5n1h2txyewy\LockApp.exe    N/A      |
|    0   N/A  N/A     10952    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     15224    C+G   ...bbwe\Microsoft.Photos.exe    N/A      |
|    0   N/A  N/A     16196    C+G   ...2txyewy\TextInputHost.exe    N/A      |
|    0   N/A  N/A     17180    C+G   ...8wekyb3d8bbwe\Cortana.exe    N/A      |
|    0   N/A  N/A     19716    C+G   ...lPanel\SystemSettings.exe    N/A      |
|    0   N/A  N/A     21884    C+G   ..._dt26b99r8h8gj\RtkUWP.exe    N/A      |
+-----------------------------------------------------------------------------+

Minimal Reproducible Example

import ultralytics
from ultralytics import YOLO
import os
import cv2
from tqdm import tqdm

#%%

ultralytics.checks()

#%%

model = YOLO(r'D:/StarSeg-Lab/projects/runs/segment/train/weights/best.pt')

#%%
path_video = r'D:/StarSeg-Lab/data/Timelapses/PycnoMonitor_3per/3per_week0_baseline.mp4'
cap = cv2.VideoCapture(path_video)
n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))  # frame count reported by the video container
zstack = []  # decoded frames, in order
pbar = tqdm(total=n_frames, position=0, leave=True)
for i in range(n_frames):
    success, img = cap.read()
    if not success:
        print(f'{i} failed to read {path_video}')
    else:
        zstack.append(img)
    pbar.update(n=1)
pbar.close()
cap.release()
#%%

img = zstack[0]
result = model.predict(img, device = '1') ## same error with either gpu

Additional

No response

Are you willing to submit a PR?

glenn-jocher commented 1 year ago

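@weertman it seems that the installed torchvision build has no CUDA kernel registered for torchvision::nms, the non-maximum suppression (NMS) operator. The error message lists kernels only for the CPU and QuantizedCPU backends, so NMS fails as soon as it receives CUDA tensors.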

To resolve the issue, try to run the model on the CPU instead of the GPU. You can do this by changing the device parameter in the model.predict() function from '1' to 'cpu'.

Alternatively, you can try downgrading your torchvision version to 0.10.1 and torch version to 1.10.1 which are known to work well together.
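
One way to confirm this diagnosis outside of YOLOv8 is to call torchvision.ops.nms directly on two hand-made boxes. The sketch below is only an illustration and is not part of Ultralytics; the function name nms_smoke_test and its return strings are made up for this example. It assumes nothing beyond torch and torchvision being importable, and reports "unavailable" rather than crashing when they are not.

```python
import importlib


def nms_smoke_test(device: str = "cuda:0") -> str:
    """Try torchvision NMS on `device`.

    Returns 'ok', 'unavailable: <reason>' or 'failed: <exception name>'.
    The name and return strings are hypothetical, chosen for this sketch.
    """
    try:
        torch = importlib.import_module("torch")
        ops = importlib.import_module("torchvision.ops")
    except (ImportError, OSError, RuntimeError) as exc:
        return f"unavailable: {exc}"

    if device.startswith("cuda") and not torch.cuda.is_available():
        return "unavailable: CUDA is not visible to torch"

    try:
        # Two overlapping boxes in (x1, y1, x2, y2) format with their scores.
        boxes = torch.tensor(
            [[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0]], device=device
        )
        scores = torch.tensor([0.9, 0.8], device=device)
        ops.nms(boxes, scores, 0.5)
    except (NotImplementedError, RuntimeError) as exc:
        # A torchvision build without CUDA kernels raises NotImplementedError here.
        return f"failed: {type(exc).__name__}"
    return "ok"


if __name__ == "__main__":
    for dev in ("cpu", "cuda:0"):
        print(dev, "->", nms_smoke_test(dev))
```

If the CPU check passes and the CUDA check fails with NotImplementedError, the torchvision build is the likely culprit rather than the model or the driver, and passing device='cpu' to model.predict() remains a working fallback in the meantime.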

weertman commented 1 year ago

Okay. I updated the requirements.txt file with

torch>=1.10.1
torchvision>=0.10.1

installed torch using

conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia

Then I started getting this error:

OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
Fatal Python error: Aborted

Thread 0x0000269c (most recent call first):
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\threading.py", line 316 in wait
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\threading.py", line 581 in wait
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\tqdm\_monitor.py", line 60 in run
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\threading.py", line 980 in _bootstrap_inner
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\threading.py", line 937 in _bootstrap

Main thread:
Current thread 0x00002470 (most recent call first):
  File "<__array_function__ internals>", line 180 in dot
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\tracker\utils\kalman_filter.py", line 384 in multi_predict
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\tracker\trackers\bot_sort.py", line 76 in multi_predict
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\tracker\trackers\bot_sort.py", line 136 in multi_predict
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\tracker\trackers\byte_tracker.py", line 210 in update
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\tracker\track.py", line 33 in on_predict_postprocess_end
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\yolo\engine\predictor.py", line 284 in run_callbacks
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\yolo\engine\predictor.py", line 194 in stream_inference
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\torch\utils\_contextlib.py", line 56 in generator_context
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\yolo\engine\predictor.py", line 127 in __call__
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\yolo\engine\model.py", line 238 in predict
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\torch\utils\_contextlib.py", line 115 in decorate_context
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\yolo\engine\model.py", line 248 in track
  File "d:\starseg-lab\src\use_yolov8_on_video.py", line 53 in <module>
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\spyder_kernels\py3compat.py", line 356 in compat_exec
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 473 in exec_code
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 615 in _exec_file
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 528 in runfile
  File "C:\Users\wlwee\AppData\Local\Temp\ipykernel_4936\2823337637.py", line 1 in <module>

Restarting kernel...

I was able to fix this using advice from this Stack Overflow question:

https://stackoverflow.com/questions/20554074/sklearn-omp-error-15-initializing-libiomp5md-dll-but-found-mk2iomp5md-dll-a

import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"

It seems like there are some dependency issues coming from the install instructions provided.

But my GPU now works, and it is tracking at a rate of 1 frame per ~20 ms :)
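
Since the OMP message itself calls KMP_DUPLICATE_LIB_OK an unsafe workaround, it can be worth looking for the duplicate runtime as well. The sketch below is a rough, assumption-laden check: it only lists files named libiomp5md.dll under a directory such as the active environment's prefix. The helper name find_openmp_runtimes is hypothetical, and finding several copies on disk is a hint, not proof, that more than one was loaded into the process.

```python
import sys
from pathlib import Path
from typing import List


def find_openmp_runtimes(root: str, name: str = "libiomp5md.dll") -> List[Path]:
    """Return every file called `name` below `root`, sorted by path.

    Hypothetical helper for illustration; it inspects the file system only
    and does not look at what the running process has actually loaded.
    """
    return sorted(p for p in Path(root).rglob(name) if p.is_file())


if __name__ == "__main__":
    # sys.prefix is the root of the active conda or virtual environment.
    copies = find_openmp_runtimes(sys.prefix)
    for path in copies:
        print(path)
    if len(copies) > 1:
        print(f"{len(copies)} copies found; packages may be bundling their own OpenMP runtime")
```

If the environment variable is used anyway, it has to be set before the libraries that load OpenMP are imported, which is why the two lines above sit at the top of the script.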

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

bbq52 commented 10 months ago

I had the same issue when using a torchvision build without CUDA support. I fixed it with: cuda==11.7 torch==2.0.0 torchvision-cu==0.15.0

glenn-jocher commented 10 months ago

@bbq52 it's great to hear that you managed to resolve the issue by aligning the versions of CUDA, PyTorch, and torchvision for compatibility. Version conflicts between CUDA, PyTorch, and torchvision can often lead to issues like the one you experienced, so ensuring they are compatible is important for stable operation.

Remember to always consult the PyTorch website or release notes for compatible versions of these libraries when setting up your environment or upgrading packages. This attention to detail can help to minimize version-related issues and ensure smooth operation of your AI models. If you encounter any other issues or have further questions about YOLOv8, feel free to reach out.

NaufalGhifari commented 9 months ago

How I fixed this issue

I just had a very similar issue. My problem was that PyTorch's pip install command: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 does not install torchvision with CUDA.

This is what my pip freeze looked like (note the missing '+cu121' on torchvision): torch==2.1.0+cu121 torchaudio==2.1.0+cu121 torchvision==0.16.0

So after executing the command above, I had to explicitly install torchvision+cuda using pip install torchvision==0.16.0+cu121 -f https://download.pytorch.org/whl/torch_stable.html.

Here's where I got the idea from: https://discuss.pytorch.org/t/notimplementederror-could-not-run-torchvision-nms-with-arguments-from-the-cuda-backend-this-could-be-because-the-operator-doesnt-exist-for-this-backend/132352

Afterwards, my pip freeze looks as follows and I no longer have the error: torch==2.1.0+cu121 torchaudio==2.1.0+cu121 torchvision==0.16.0+cu121

Make sure you install the correct version for your case. Hope this helps!

glenn-jocher commented 9 months ago

@NaufalGhifari thank you for detailing the steps you took to solve the issue. It's important for users to verify whether torchvision is installed with CUDA support, as this can lead to the NotImplementedError when CUDA-based operations like non-maximum suppression are called.

Your solution to install the CUDA-specific version of torchvision after your initial setup is a good reminder that sometimes you need to manually ensure that compatible CUDA versions are explicitly installed.

For those reading this and needing to resolve similar issues: always check the versions of torch, torchvision, and torchaudio to ensure they match your CUDA version. The PyTorch website provides a handy tool for generating the correct pip or conda install commands based on your specific environment settings, such as OS, package manager, Python version, and CUDA version.

Remember that maintaining alignment between the versions of these libraries is crucial for the proper functioning of CUDA-dependent PyTorch functionalities, such as those employed in YOLOv8. Keep an eye on the compatibility to avoid runtime errors and to utilize your GPU resources effectively. Your experience will certainly help others encountering the same problem.
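
The version check described above can be sketched in a few lines that compare the local build tags (the part after '+') in pip freeze output. This is an illustration only: the function names build_tags and mismatched are invented here, and the check assumes pip wheels from the PyTorch index, which carry tags such as '+cu121'. Conda packages may not show such a tag, so a missing tag is a prompt to look closer and not a verdict.

```python
from typing import Dict, List, Optional

# Packages whose builds are expected to share the same CUDA tag.
PACKAGES = ("torch", "torchvision", "torchaudio")


def build_tags(freeze_text: str) -> Dict[str, Optional[str]]:
    """Map each PyTorch-family package to its local build tag, or None if absent."""
    tags: Dict[str, Optional[str]] = {}
    for token in freeze_text.split():
        name, sep, version = token.partition("==")
        if sep and name.lower() in PACKAGES:
            _, plus, local = version.partition("+")
            tags[name.lower()] = local if plus else None
    return tags


def mismatched(tags: Dict[str, Optional[str]]) -> List[str]:
    """Return the packages whose build tag differs from torch's tag."""
    reference = tags.get("torch")
    return sorted(n for n, t in tags.items() if n != "torch" and t != reference)


if __name__ == "__main__":
    # The two pip freeze snippets quoted earlier in this thread.
    before = "torch==2.1.0+cu121 torchaudio==2.1.0+cu121 torchvision==0.16.0"
    after = "torch==2.1.0+cu121 torchaudio==2.1.0+cu121 torchvision==0.16.0+cu121"
    print(mismatched(build_tags(before)))  # torchvision lacks the cu121 tag
    print(mismatched(build_tags(after)))   # nothing flagged
```

Feeding it the real output of pip freeze works the same way, because unrelated lines are simply skipped.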

doubtfire009 commented 6 months ago

Thank you! You are a life saver! This answer fixed my issue.

glenn-jocher commented 6 months ago

@doubtfire009 we're thrilled to hear you got your issue resolved! 🎉 Matching versions of torch, torchvision, and CUDA can indeed be tricky, but it looks like you navigated it perfectly. If you or anyone else runs into similar hiccups, remember to check those version alignments closely. Also, the PyTorch discussion forums and documentation are gold mines for troubleshooting these sorts of issues. Thanks for sharing your fix with the community, and happy coding with YOLOv8! 🚀