Closed lida2003 closed 3 weeks ago
👋 Hello @lida2003, thank you for your interest in YOLOv5 🚀! Please refer to our tutorials to get started, where you can find quickstart guides for simple tasks like custom data training and advanced concepts like hyperparameter evolution.
Since this is a 🐛 Bug Report, could you please provide a minimum reproducible example to help us debug the issue? If this is related to PyTorch or torchvision compatibility, please ensure that your versions are compatible, as this might be the source of the issue.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our tips for best training results.
Make sure you have Python>=3.8.0 with all required packages including PyTorch>=1.8 installed. To get started, consider cloning the YOLOv5 repository and installing the required packages.
YOLOv5 can be run in various environments such as online notebooks with free GPUs, Google Cloud Deep Learning VM, Amazon Deep Learning AMI, and Docker Image. Choose the environment that best fits your setup.
If the YOLOv5 Continuous Integration tests are passing, it ensures the repository is working correctly across different systems. CI tests verify correct operation of YOLOv5 training, validation, inference, export, and benchmarks on macOS, Windows, and Ubuntu.
Stay tuned, as an Ultralytics engineer will assist you soon! Feel free to check our latest state-of-the-art model, YOLOv8, which promises improved performance and is easy to use. Happy coding!
The versions below have been tested; all failed with the same log.
BTW, this PyTorch is a binary release from NVIDIA, as NVIDIA confirmed in a previous discussion on the NVIDIA forum.
daniel@daniel-nvidia:~/Work/yolov5$ python --version
Python 3.8.10
daniel@daniel-nvidia:~/Work/yolov5$ python -c "import torch; import torchvision; print(f'PyTorch version: {torch.__version__}'); print(f'Torchvision version: {torchvision.__version__}')"
/home/daniel/.local/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/home/daniel/.local/lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKSsb'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
PyTorch version: 2.1.0a0+41361538.nv23.06
Torchvision version: 0.16.0
Please ensure that your PyTorch and torchvision versions are compatible. You might want to try reinstalling torchvision to match your PyTorch version. If the issue persists, consider testing with the latest stable releases of both packages.
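For anyone checking the same pairing, here is a minimal sketch of a version check. It assumes the minor-version pairings listed in the torchvision README compatibility table; the names `TORCH_TO_TORCHVISION`, `major_minor`, and `versions_match` are made up for illustration and are not part of PyTorch or Ultralytics.

```python
import re

# Assumed pairings (torch minor release -> torchvision minor release),
# copied from the torchvision README compatibility table.
TORCH_TO_TORCHVISION = {
    "2.0": "0.15",
    "2.1": "0.16",
    "2.2": "0.17",
    "2.3": "0.18",
    "2.4": "0.19",
    "2.5": "0.20",
}


def major_minor(version: str) -> str:
    """Return 'X.Y' from a version string such as '2.1.0a0+41361538.nv23.06'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return f"{match.group(1)}.{match.group(2)}"


def versions_match(torch_version: str, torchvision_version: str) -> bool:
    """True when torchvision's minor release is the one listed for this torch minor release."""
    expected = TORCH_TO_TORCHVISION.get(major_minor(torch_version))
    return expected is not None and expected == major_minor(torchvision_version)


if __name__ == "__main__":
    # Version strings taken from the logs in this thread.
    print(versions_match("2.1.0a0+41361538.nv23.06", "0.16.0"))
    print(versions_match("2.1.0a0+41361538.nv23.06", "0.20.0a0+945bdad"))
```

Note that a matching minor version only says the pairing is the documented one. It does not prove that a source-built torchvision was compiled against the exact torch binary that is installed, which is what the "undefined symbol" warning above hints at.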
Please ensure that your PyTorch and torchvision versions are compatible.
According to https://docs.ultralytics.com/guides/nvidia-jetson/#install-pytorch-and-torchvision_1 , these versions are compatible, but it still doesn't work.
You might want to try reinstalling torchvision to match your PyTorch version.
Yes, I think I have tested as many combinations as I can.
If the issue persists, consider testing with the latest stable releases of both packages.
NVIDIA JetPack 5.1.3 ships Python 3.8.10, while the latest versions suggest using Python 3.10. So I would like to stay on the NVIDIA JetPack 5.1.3 release for now.
The question is: why did it go wrong when I did what the guide suggests? Is there any extra software I have to install?
It seems like there might be an issue with the NVIDIA-specific PyTorch build. Ensure all dependencies like libjpeg and libpng are installed before building torchvision. You might also try using a virtual environment to isolate the setup. If the problem continues, consider reaching out to NVIDIA support for further assistance with their specific PyTorch release.
It seems like there might be an issue with the NVIDIA-specific PyTorch build.
Well, I did ask them about the binary build, and as they told me, I installed the binary first and then built torchvision. For details, see the previous link: Pytorch & torchversion compatible issue on L4T35.5.0
Ensure all dependencies like libjpeg and libpng are installed before building torchvision.
Yes, it's all installed.
You might also try using a virtual environment to isolate the setup. If the problem continues, consider reaching out to NVIDIA support for further assistance with their specific PyTorch release.
The virtual environment setup ran into issues when running those scripts, so I would rather focus on the real setup. It uses less CPU/memory and matches the real deployment of the software.
Can you let me know what "custom C++ ops" means?
@pderrenger
BTW, should I install libpng++-dev? Please check the installation status below for libjpeg (libjpeg-dev) and libpng (libpng-dev).
daniel@daniel-nvidia:~$ aptitude search libjpeg
v libjpeg-dbg -
i libjpeg-dev - Independent JPEG Group's JPEG runtime library (dependency package)
p libjpeg-progs - Programs for manipulating JPEG files
p libjpeg-tools - Complete implementation of 10918-1 (JPEG)
p libjpeg-turbo-progs - Programs for manipulating JPEG files
p libjpeg-turbo-test - Program for benchmarking and testing libjpeg-turbo
i A libjpeg-turbo8 - IJG JPEG compliant runtime library.
p libjpeg-turbo8-dbg - Debugging symbols for the libjpeg-turbo library
i libjpeg-turbo8-dev - Development files for the IJG JPEG library
p libjpeg62 - Independent JPEG Group's JPEG runtime library (version 6.2)
p libjpeg62-dev - Development files for the IJG JPEG library (version 6.2)
i A libjpeg8 - Independent JPEG Group's JPEG runtime library (dependency package)
p libjpeg8-dbg - Independent JPEG Group's JPEG runtime library (dependency package)
i libjpeg8-dev - Independent JPEG Group's JPEG runtime library (dependency package)
p libjpeg9 - Independent JPEG Group's JPEG runtime library
p libjpeg9-dev - Development files for the IJG JPEG library
daniel@daniel-nvidia:~$ aptitude search libpng
p libpng++-dev - C++ interface to the PNG (Portable Network Graphics) library
i libpng-dev - PNG library - development (version 1.6)
p libpng-sixlegs-java - Sixlegs Java PNG Decoder
p libpng-sixlegs-java-doc - Documentation for Sixlegs Java PNG Decoder
p libpng-tools - PNG library - tools (version 1.6)
i A libpng16-16 - PNG library - runtime (version 1.6)
p libpnglite-dev - lightweight C library for loading and writing PNG images
p libpnglite0 - lightweight C library for loading and writing PNG images
You don't need to install libpng++-dev specifically. Having libjpeg-dev and libpng-dev should be sufficient for building torchvision. If issues persist, ensure all dependencies are correctly installed and consider testing with the latest package versions.
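As a quick cross-check alongside the aptitude listing, a small sketch like the one below asks the system loader which JPEG and PNG shared libraries it can resolve. It uses the standard library's `ctypes.util.find_library`; the helper name `find_image_libs` is hypothetical. It only sees runtime libraries, not the `-dev` headers that the torchvision build itself needs, so treat it as a partial check.

```python
from ctypes.util import find_library


def find_image_libs(names=("jpeg", "png")):
    """Map each library name to the shared-library name the loader resolves, or None."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, resolved in find_image_libs().items():
        print(f"lib{name}: {resolved or 'not found'}")
```

If both libraries resolve but `torchvision.io` still warns about `image.so`, the problem is more likely the torch/torchvision binary pairing than a missing system package.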
$ pip uninstall torchvision
Found existing installation: torchvision 0.16.0
Uninstalling torchvision-0.16.0:
Would remove:
/home/daniel/.local/lib/python3.8/site-packages/torchvision-0.16.0.dist-info/*
/home/daniel/.local/lib/python3.8/site-packages/torchvision/*
Proceed (Y/n)? Y
Successfully uninstalled torchvision-0.16.0
daniel@daniel-nvidia:~/Work/torchvision$ pip install .
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Processing /home/daniel/Work/torchvision
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy in /home/daniel/.local/lib/python3.8/site-packages (from torchvision==0.20.0a0+945bdad) (1.23.5)
Requirement already satisfied: torch in /home/daniel/.local/lib/python3.8/site-packages (from torchvision==0.20.0a0+945bdad) (2.1.0a0+41361538.nv23.6)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /home/daniel/.local/lib/python3.8/site-packages (from torchvision==0.20.0a0+945bdad) (10.4.0)
Requirement already satisfied: filelock in /usr/lib/python3/dist-packages (from torch->torchvision==0.20.0a0+945bdad) (3.0.12)
Requirement already satisfied: fsspec in /home/daniel/.local/lib/python3.8/site-packages (from torch->torchvision==0.20.0a0+945bdad) (2024.10.0)
Requirement already satisfied: jinja2 in /usr/lib/python3/dist-packages (from torch->torchvision==0.20.0a0+945bdad) (2.10.1)
Requirement already satisfied: networkx in /home/daniel/.local/lib/python3.8/site-packages (from torch->torchvision==0.20.0a0+945bdad) (3.1)
Requirement already satisfied: sympy in /home/daniel/.local/lib/python3.8/site-packages (from torch->torchvision==0.20.0a0+945bdad) (1.13.3)
Requirement already satisfied: typing-extensions in /home/daniel/.local/lib/python3.8/site-packages (from torch->torchvision==0.20.0a0+945bdad) (4.12.2)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /home/daniel/.local/lib/python3.8/site-packages (from sympy->torch->torchvision==0.20.0a0+945bdad) (1.3.0)
Building wheels for collected packages: torchvision
Building wheel for torchvision (pyproject.toml) ... done
Created wheel for torchvision: filename=torchvision-0.20.0a0+945bdad-cp38-cp38-linux_aarch64.whl size=1196790 sha256=294124bf5972ee4c9085f43352ed80b94772979d82970f5319e3d0c403967313
Stored in directory: /tmp/pip-ephem-wheel-cache-dbitv4r7/wheels/12/f2/fd/9a2cd59f45fe55f3ec87a661481722bd68e804b4c7a21bceca
Successfully built torchvision
Installing collected packages: torchvision
Successfully installed torchvision-0.20.0a0+945bdad
Can you let me know what "custom C++ ops" or "torch._custom_ops" is?
daniel@daniel-nvidia:~/Work$ yolo track model=yolov8n.engine source=../Videos/Worlds_longest_drone_fpv_one_shot.mp4
WARNING ⚠️ torchvision==0.20 is incompatible with torch==2.1.
Run 'pip install torchvision==0.16' to fix torchvision or 'pip install -U torch torchvision' to update both.
For a full compatibility table see https://github.com/pytorch/vision#installation
WARNING ⚠️ Python>=3.10 is required, but Python==3.8.10 is currently installed
WARNING ⚠️ Unable to automatically guess model task, assuming 'task=detect'. Explicitly define task for your model, i.e. 'task=detect', 'segment', 'classify','pose' or 'obb'.
Ultralytics 8.3.21 🚀 Python-3.8.10 torch-2.1.0a0+41361538.nv23.06 CUDA:0 (Orin, 7451MiB)
Loading yolov8n.engine for TensorRT inference...
[10/30/2024-20:36:49] [TRT] [I] Loaded engine size: 13 MiB
[10/30/2024-20:36:49] [TRT] [W] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
[10/30/2024-20:36:50] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +616, GPU +583, now: CPU 1003, GPU 3992 (MiB)
[10/30/2024-20:36:50] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +12, now: CPU 0, GPU 12 (MiB)
[10/30/2024-20:36:50] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 990, GPU 3981 (MiB)
[10/30/2024-20:36:50] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +18, now: CPU 0, GPU 30 (MiB)
Traceback (most recent call last):
  File "/home/daniel/.local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/daniel/.local/lib/python3.8/site-packages/ultralytics/cfg/__init__.py", line 824, in entrypoint
    getattr(model, mode)(**overrides) # default args from model
  File "/home/daniel/.local/lib/python3.8/site-packages/ultralytics/engine/model.py", line 601, in track
    return self.predict(source=source, stream=stream, **kwargs)
  File "/home/daniel/.local/lib/python3.8/site-packages/ultralytics/engine/model.py", line 554, in predict
    return self.predictor.predict_cli(source=source) if is_cli else self.predictor(source=source, stream=stream)
  File "/home/daniel/.local/lib/python3.8/site-packages/ultralytics/engine/predictor.py", line 183, in predict_cli
    for _ in gen: # sourcery skip: remove-empty-nested-block, noqa
  File "/home/daniel/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/home/daniel/.local/lib/python3.8/site-packages/ultralytics/engine/predictor.py", line 234, in stream_inference
    self.model.warmup(imgsz=(1 if self.model.pt or self.model.triton else self.dataset.bs, 3, *self.imgsz))
  File "/home/daniel/.local/lib/python3.8/site-packages/ultralytics/nn/autobackend.py", line 642, in warmup
    import torchvision # noqa (import here so torchvision import time not recorded in postprocess time)
  File "/home/daniel/.local/lib/python3.8/site-packages/torchvision/__init__.py", line 10, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils # usort:skip
  File "/home/daniel/.local/lib/python3.8/site-packages/torchvision/_meta_registrations.py", line 4, in <module>
    import torch._custom_ops
ModuleNotFoundError: No module named 'torch._custom_ops'
daniel@daniel-nvidia:~/Work$ cd yolov5/
daniel@daniel-nvidia:~/Work/yolov5$ python detect.py --weights yolov5s.pt --source ../Videos/Worlds_longest_drone_fpv_one_shot.mp4
WARNING ⚠️ torchvision==0.20 is incompatible with torch==2.1.
Run 'pip install torchvision==0.16' to fix torchvision or 'pip install -U torch torchvision' to update both.
For a full compatibility table see https://github.com/pytorch/vision#installation
WARNING ⚠️ Python>=3.10 is required, but Python==3.8.10 is currently installed
Traceback (most recent call last):
  File "detect.py", line 48, in <module>
    from models.common import DetectMultiBackend
  File "/home/daniel/Work/yolov5/models/common.py", line 39, in <module>
    from utils.dataloaders import exif_transpose, letterbox
  File "/home/daniel/Work/yolov5/utils/dataloaders.py", line 23, in <module>
    import torchvision
  File "/home/daniel/.local/lib/python3.8/site-packages/torchvision/__init__.py", line 10, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils # usort:skip
  File "/home/daniel/.local/lib/python3.8/site-packages/torchvision/_meta_registrations.py", line 4, in <module>
    import torch._custom_ops
ModuleNotFoundError: No module named 'torch._custom_ops'
"Custom C++ ops" are the compiled C++/CUDA operators that torchvision ships (for example NMS), and torch._custom_ops is an internal PyTorch module for registering custom operators. The error indicates a missing module, likely due to version incompatibility. In the traceback above, the freshly built torchvision 0.20 imports torch._custom_ops, which the installed torch 2.1.0a0 build does not provide. Please ensure you are using compatible versions of PyTorch and torchvision as per the compatibility matrix provided by PyTorch.
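One way to see which module is missing without running the whole pipeline is to probe for it directly. The sketch below uses the standard library's `importlib.util.find_spec`; the helper name `module_available` is hypothetical. Note that probing a dotted name such as `torch._custom_ops` imports the parent package (`torch`) as a side effect.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located, importing parent packages as needed."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import
        # (ModuleNotFoundError is a subclass of ImportError).
        return False


if __name__ == "__main__":
    # Module names taken from the traceback in this thread.
    for name in ("torch", "torch._custom_ops", "torchvision"):
        status = "found" if module_available(name) else "MISSING"
        print(f"{name}: {status}")
```

If `torch` is found but `torch._custom_ops` is not, the installed torch build predates what the installed torchvision expects, which matches the ModuleNotFoundError above.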
The error indicates a missing module, likely due to version incompatibility.
I don't know which module is missing or which version is incompatible, and I haven't figured out why the proven-good version from NVIDIA doesn't work. Are there any debug options that can identify the missing module or the version incompatibility?
Please ensure you are using compatible versions of PyTorch and torchvision as per the compatibility matrix provided by PyTorch.
I have now tried the following versions. Is there any compatibility issue?
It seems you've tried several combinations. To debug further, check the compatibility matrix on the PyTorch GitHub page to ensure the versions align. If issues persist, consider testing with the latest stable releases of both PyTorch and torchvision.
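One detail worth checking while comparing against the matrix is whether a version string denotes a final release at all. Under PEP 440, a suffix like `a0` marks an alpha pre-release and the part after `+` is a local build label, so `2.1.0a0+41361538.nv23.06` is a snapshot taken before the final 2.1.0 and may not contain everything the final release has. The following is a simplified sketch, not a full PEP 440 parser; `VERSION_RE` and `describe_version` are illustrative names.

```python
import re

# Simplified PEP 440 pattern: release segment, optional pre-release tag,
# optional local build label after '+'. Not a complete implementation.
VERSION_RE = re.compile(
    r"^(?P<release>\d+(?:\.\d+)*)"
    r"(?P<pre>(?:a|b|rc)\d+)?"
    r"(?:\+(?P<local>[A-Za-z0-9.]+))?$"
)


def describe_version(version: str) -> dict:
    """Split a version string into its release, pre-release, and local parts."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return {
        "release": match.group("release"),
        "pre": match.group("pre"),      # e.g. 'a0' for an alpha snapshot, else None
        "local": match.group("local"),  # e.g. vendor build label, else None
    }


if __name__ == "__main__":
    # Version strings taken from the logs in this thread.
    for v in ("2.1.0a0+41361538.nv23.06", "0.16.0", "0.20.0a0+945bdad"):
        print(v, describe_version(v))
```

A non-empty `pre` field is a hint that the documented pairing for the final release may not hold exactly for that build.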
@pderrenger
Thank you for reaching out. Unfortunately, we don't have direct contact with NVIDIA. For issues related to their specific PyTorch builds, I recommend continuing discussions on the NVIDIA forums or contacting their support for assistance.
OK, thanks for your time. I hope NVIDIA will support this production version and help me find out what is going on there.
You're welcome. I recommend continuing to engage with NVIDIA support for further assistance on this issue. If you have any other questions related to YOLOv5, feel free to ask.
Question
I followed the README and everything seemed fine:
When I run the object detection command, I get "RuntimeError: Couldn't load custom C++ ops".
I don't know why. There is further discussion with NVIDIA here.