Closed weertman closed 1 year ago
@weertman it seems that the issue is related to torchvision's non_max_suppression (nms) function not being able to run on the CUDA backend.
To resolve the issue, try to run the model on the CPU instead of the GPU. You can do this by changing the device parameter in the model.predict() function from '1' to 'cpu'.
Alternatively, you can try downgrading your torchvision version to 0.10.1 and torch version to 1.10.1 which are known to work well together.
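The CPU fallback above can be sketched as a small, hypothetical helper (the `pick_device` name and the two probe flags are illustrative, not part of the `ultralytics` API; the flags are assumed to be filled in beforehand, e.g. from `torch.cuda.is_available()` and a tiny `torchvision.ops.nms` call on a CUDA tensor wrapped in try/except):

```python
def pick_device(cuda_available: bool, cuda_nms_works: bool) -> str:
    """Choose a device string for model.predict().

    Falls back to 'cpu' when torchvision's CUDA NMS kernel is unusable,
    which is what the NotImplementedError in this thread indicates.
    """
    if cuda_available and cuda_nms_works:
        return "0"  # first GPU index, one of the forms ultralytics accepts
    return "cpu"

print(pick_device(True, False))  # CPU-only torchvision build -> 'cpu'
```

The chosen string would then be passed straight through, e.g. `model.predict(source, device=pick_device(...))`.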
Okay. I updated the requirements.txt file with
torch>=1.10.1
torchvision>=0.10.1
installed torch using
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
Then I started getting this error:
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
Fatal Python error: Aborted

Thread 0x0000269c (most recent call first):
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\threading.py", line 316 in wait
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\threading.py", line 581 in wait
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\tqdm\_monitor.py", line 60 in run
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\threading.py", line 980 in _bootstrap_inner
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\threading.py", line 937 in _bootstrap

Main thread:
Current thread 0x00002470 (most recent call first):
  File "<__array_function__ internals>", line 180 in dot
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\tracker\utils\kalman_filter.py", line 384 in multi_predict
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\tracker\trackers\bot_sort.py", line 76 in multi_predict
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\tracker\trackers\bot_sort.py", line 136 in multi_predict
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\tracker\trackers\byte_tracker.py", line 210 in update
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\tracker\track.py", line 33 in on_predict_postprocess_end
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\yolo\engine\predictor.py", line 284 in run_callbacks
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\yolo\engine\predictor.py", line 194 in stream_inference
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\torch\utils\_contextlib.py", line 56 in generator_context
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\yolo\engine\predictor.py", line 127 in __call__
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\yolo\engine\model.py", line 238 in predict
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\torch\utils\_contextlib.py", line 115 in decorate_context
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\ultralytics\yolo\engine\model.py", line 248 in track
  File "d:\starseg-lab\src\use_yolov8_on_video.py", line 53 in <module>
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\spyder_kernels\py3compat.py", line 356 in compat_exec
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 473 in exec_code
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 615 in _exec_file
  File "C:\Users\wlwee\anaconda3\envs\yolov8\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 528 in runfile
  File "C:\Users\wlwee\AppData\Local\Temp\ipykernel_4936\2823337637.py", line 1 in <module>

Restarting kernel...
I was able to fix this using advice from a Stack Overflow answer:
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
It seems like there are some dependency issues coming from the install instructions provided.
But my GPU now works and it is tracking at a rate of one frame per ~20 ms :)
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐
I had the same issue when using a torchvision build without CUDA. I fixed it with: cuda==11.7, torch==2.0.0, torchvision-cu==0.15.0
@bbq52 it's great to hear that you managed to resolve the issue by aligning the versions of CUDA, PyTorch, and torchvision for compatibility. Version conflicts between CUDA, PyTorch, and torchvision can often lead to issues like the one you experienced, so ensuring they are compatible is important for stable operation.
Remember to always consult the PyTorch website or release notes for compatible versions of these libraries when setting up your environment or upgrading packages. This attention to detail can help to minimize version-related issues and ensure smooth operation of your AI models. If you encounter any other issues or have further questions about YOLOv8, feel free to reach out.
I just had a very similar issue. My problem was that PyTorch's pip install command: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
does not install torchvision with CUDA.
This is what my pip freeze output looked like (note the missing '+cu121'):
torch==2.1.0+cu121
torchaudio==2.1.0+cu121
torchvision==0.16.0
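This failure mode can be spotted mechanically from the version strings. Below is a hypothetical sanity check (the `cuda_suffix` helper is illustrative, not a pip or torch API); under PEP 440 the `+cu121` part is a local version identifier, and its absence marks a CPU-only wheel:

```python
from typing import Optional

def cuda_suffix(version: str) -> Optional[str]:
    """Return the local-version CUDA tag (e.g. 'cu121') from a pip version
    string like '2.1.0+cu121', or None for a CPU-only wheel."""
    _, sep, local = version.partition("+")
    return local if sep and local.startswith("cu") else None

# The pip freeze lines quoted above, as a dict
freeze = {
    "torch": "2.1.0+cu121",
    "torchaudio": "2.1.0+cu121",
    "torchvision": "0.16.0",  # missing '+cu121' -> CPU-only wheel
}

cpu_only = [name for name, ver in freeze.items() if cuda_suffix(ver) is None]
print(cpu_only)  # ['torchvision']
```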
So after executing the command above, I had to explicitly install the CUDA build of torchvision using pip install torchvision==0.16.0+cu121 -f https://download.pytorch.org/whl/torch_stable.html.
Here's where I got the idea from: https://discuss.pytorch.org/t/notimplementederror-could-not-run-torchvision-nms-with-arguments-from-the-cuda-backend-this-could-be-because-the-operator-doesnt-exist-for-this-backend/132352
Afterwards, my pip freeze output looks as follows and I no longer have the error:
torch==2.1.0+cu121
torchaudio==2.1.0+cu121
torchvision==0.16.0+cu121
Make sure you install the correct version for your case. Hope this helps!
@NaufalGhifari thank you for detailing the steps you took to solve the issue. It's important for users to verify whether torchvision is installed with CUDA support, as this can lead to the NotImplementedError
when CUDA-based operations like non-maximum suppression are called.
Your solution to install the CUDA-specific version of torchvision after your initial setup is a good reminder that sometimes you need to manually ensure that compatible CUDA versions are explicitly installed.
For those reading this and needing to resolve similar issues: always check the versions of torch, torchvision, and torchaudio to ensure they match your CUDA version. The PyTorch website provides a handy tool for generating the correct pip or conda install commands based on your specific environment settings, such as OS, package manager, Python version, and CUDA version.
Remember that maintaining alignment between the versions of these libraries is crucial for the proper functioning of CUDA-dependent PyTorch functionalities, such as those employed in YOLOv8. Keep an eye on the compatibility to avoid runtime errors and to utilize your GPU resources effectively. Your experience will certainly help others encountering the same problem.
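That alignment check can be sketched in a few lines. The `COMPAT` table below is a small, hand-written subset for illustration only; the authoritative torch/torchvision pairings live in the pytorch/vision release notes:

```python
# Hypothetical subset of the published torch <-> torchvision pairings.
COMPAT = {"2.1.0": "0.16.0", "2.0.0": "0.15.1", "1.13.1": "0.14.1"}

def aligned(torch_version: str, vision_version: str) -> bool:
    """True when the release numbers are a known pair AND both wheels carry
    the same local tag (e.g. '+cu121'), so a CPU-only torchvision sitting
    next to a CUDA torch is flagged as misaligned."""
    t_base, _, t_local = torch_version.partition("+")
    v_base, _, v_local = vision_version.partition("+")
    return COMPAT.get(t_base) == v_base and t_local == v_local

print(aligned("2.1.0+cu121", "0.16.0"))        # False: CPU-only torchvision
print(aligned("2.1.0+cu121", "0.16.0+cu121"))  # True
```

Comparing the local tags, not just the release numbers, is what catches the exact mismatch reported in this thread.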
Thank you! You are a life saver! This answer fixed my issue.
@doubtfire009 we're thrilled to hear you got your issue resolved! Matching versions of torch, torchvision, and CUDA can indeed be tricky, but it looks like you navigated it perfectly. If you or anyone else runs into similar hiccups, remember to check those version alignments closely. Also, the PyTorch discussion forums and documentation are gold mines for troubleshooting these sorts of issues. Thanks for sharing your fix with the community, and happy coding with YOLOv8!
Search before asking
YOLOv8 Component: Detection
Bug
Environment
Minimal Reproducible Example
Additional: No response
Are you willing to submit a PR?