laugh12321 / TensorRT-YOLO

🚀 Your YOLO Deployment Powerhouse. With the synergy of TensorRT Plugins, CUDA Kernels, and CUDA Graphs, experience lightning-fast inference speeds.
https://github.com/laugh12321/TensorRT-YOLO
GNU General Public License v3.0

[Help]: RuntimeError: Deploy initialization failed! Error: DLL load failed while importing pydeploy #37

Closed pikaqkio closed 4 months ago

pikaqkio commented 4 months ago

Both CLI and Python inference report the same error; all the preceding steps completed without problems.

(trtyolo) C:\Yolo\TRT-YOLO\demo\detect>trtyolo infer -e models/yolov8n-0508.engine -i images -o output -l labels.txt --cudaGraph
[I] Successfully found necessary library paths:
{
    "cudart": "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.8\\bin",
    "nvinfer": "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\TensorRT\\v8.6.1.6\\lib",
    "cudnn": "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\cuDNN\\v8.9.2.26\\bin"
}
Traceback (most recent call last):
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\tensorrt_yolo\c_lib_wrap.py", line 148, in <module>
    from .libs.pydeploy import *
ImportError: DLL load failed while importing pydeploy: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\tang_\.conda\envs\trtyolo\Scripts\trtyolo.exe\__main__.py", line 7, in <module>
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\rich_click\rich_command.py", line 367, in __call__
    return super().__call__(*args, **kwargs)
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\rich_click\rich_command.py", line 152, in main
    rv = self.invoke(ctx)
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\tensorrt_yolo\cli.py", line 151, in infer
    from .infer import CpuTimer, DeployCGDet, DeployDet, GpuTimer, generate_labels_with_colors, visualize_detections
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\tensorrt_yolo\infer\__init__.py", line 1, in <module>
    from .detection import Box, DeployCGDet, DeployDet, DetectionResult
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\tensorrt_yolo\infer\detection.py", line 27, in <module>
    from .. import c_lib_wrap as C
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\tensorrt_yolo\c_lib_wrap.py", line 150, in <module>
    raise RuntimeError(f"Deploy initialization failed! Error: {e}")
RuntimeError: Deploy initialization failed! Error: DLL load failed while importing pydeploy: The specified module could not be found.

(trtyolo) C:\Yolo\TRT-YOLO\demo\detect>python detect.py -e models/yolov8n-0508.engine -i images -o output -l labels.txt --cudaGraph
[I] Successfully found necessary library paths:
{
    "cudart": "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.8\\bin",
    "nvinfer": "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\TensorRT\\v8.6.1.6\\lib",
    "cudnn": "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\cuDNN\\v8.9.2.26\\bin"
}
Traceback (most recent call last):
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\tensorrt_yolo\c_lib_wrap.py", line 148, in <module>
    from .libs.pydeploy import *
ImportError: DLL load failed while importing pydeploy: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Yolo\TRT-YOLO\demo\detect\detect.py", line 114, in <module>
    main()
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\rich_click\rich_command.py", line 152, in main
    rv = self.invoke(ctx)
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "C:\Yolo\TRT-YOLO\demo\detect\detect.py", line 71, in main
    from tensorrt_yolo.infer import CpuTimer, DeployCGDet, DeployDet, GpuTimer, generate_labels_with_colors, visualize_detections
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\tensorrt_yolo\infer\__init__.py", line 1, in <module>
    from .detection import Box, DeployCGDet, DeployDet, DetectionResult
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\tensorrt_yolo\infer\detection.py", line 27, in <module>
    from .. import c_lib_wrap as C
  File "C:\Users\tang_\.conda\envs\trtyolo\lib\site-packages\tensorrt_yolo\c_lib_wrap.py", line 150, in <module>
    raise RuntimeError(f"Deploy initialization failed! Error: {e}")
RuntimeError: Deploy initialization failed! Error: DLL load failed while importing pydeploy: The specified module could not be found.

(trtyolo) C:\Yolo\TRT-YOLO\demo\detect>

laugh12321 commented 4 months ago

By default the wheel is compiled and packaged for Python 3.10. Can you check whether your trtyolo virtual environment is running Python 3.10?
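If it helps, here is a minimal check sketch (my own, not part of the project; it assumes the prebuilt wheel targets CPython 3.10 / cp310) to confirm the interpreter inside the trtyolo env matches the wheel tag:

import sys
import sysconfig

# Sanity check: the wheel's extension modules only load on the Python version they were built for.
print("interpreter:", sys.version.split()[0])
print("ext suffix :", sysconfig.get_config_var("EXT_SUFFIX"))  # e.g. ".cp310-win_amd64.pyd" on Windows
assert sys.version_info[:2] == (3, 10), "this wheel was built for Python 3.10"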

pikaqkio commented 4 months ago

> By default the wheel is compiled and packaged for Python 3.10. Can you check whether your trtyolo virtual environment is running Python 3.10?

(trtyolo) C:\Windows\system32>python --version
Python 3.10.6
laugh12321 commented 4 months ago

@pikaqkio Is it working now?

pikaqkio commented 4 months ago

> @pikaqkio Is it working now?

My system previously had both CUDA v11.8 and v12.1 installed, and the error persisted even with the CUDA v11.8 path placed at the top of the system PATH environment variable. After keeping only CUDA 12.1 it works; I'm not sure whether that was actually the cause. By the way, I was also able to install cuDNN without zlibwapi (I couldn't find a download for that file anyway). Thanks for the guidance!

(trtyolo) C:\Yolo\TensorRT-YOLO\demo\detect>trtyolo infer -e models/yolov8n-0508.engine -i images -o output -l labels.txt --cudaGraph
[I] Successfully found necessary library paths:
{
    "cudart": "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.1\\bin",
    "nvinfer": "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\TensorRT\\v8.6.1.6\\lib",
    "cudnn": "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\cnDNN\\v8.9.7.29\\bin"
}
[I] Infering data in images
Processing batches ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:02
[S] Benchmark results include time for H2D and D2H memory copies, preprocessing, and postprocessing.
    CPU Average Latency: 1.046 ms
    GPU Average Latency: 1.029 ms
    Finished Inference.

(trtyolo) C:\Yolo\TensorRT-YOLO\demo\detect>
laugh12321 commented 4 months ago

Thanks for the feedback. It's most likely a problem caused by having multiple CUDA versions installed. zlib isn't needed by this project, but it is sometimes required when training with torch.
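For anyone hitting the same DLL error with several CUDA toolkits installed side by side, here is a hypothetical workaround sketch (not the project's own loading logic; the paths are examples only). On Python 3.8+ for Windows, native extension modules no longer resolve dependent DLLs through PATH, so registering the intended directories explicitly before the import can avoid picking up the wrong CUDA runtime:

import os

# Example paths only: point these at the toolkit versions the wheel was built against.
cuda_dirs = [
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin",
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\TensorRT\v8.6.1.6\lib",
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\cuDNN\v8.9.7.29\bin",
]

# Register DLL directories explicitly instead of relying on PATH ordering.
for d in cuda_dirs:
    if os.path.isdir(d):
        os.add_dll_directory(d)

# Import only after the DLL directories are registered.
from tensorrt_yolo.infer import DeployDet  # noqa: E402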

pikaqkio commented 4 months ago

> Thanks for the feedback. It's most likely a problem caused by having multiple CUDA versions installed. zlib isn't needed by this project, but it is sometimes required when training with torch.

Off topic, but a question about cv2.waitKey: in OpenCV, is the minimum interval of cv2.waitKey() 16 ms? I've found that even with cv2.waitKey(1), the displayed FPS never exceeds 60.

laugh12321 commented 4 months ago

Regarding the cv2.waitKey question, you can refer to this answer: https://github.com/laugh12321/TensorRT-YOLO/issues/19#issuecomment-2085189425
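As a quick way to see where the ~60 FPS ceiling comes from on a given machine (my own sketch, unrelated to this project), you can measure how long cv2.waitKey(1) actually blocks; the GUI backend and display refresh, not the requested 1 ms, usually set the floor:

import time

import cv2
import numpy as np

# Show a dummy frame so a HighGUI window exists and its event loop runs.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
cv2.imshow("waitkey-test", frame)

n = 200
start = time.perf_counter()
for _ in range(n):
    cv2.waitKey(1)  # requests a 1 ms wait, but the backend may round it up
elapsed = time.perf_counter() - start

print(f"average waitKey(1) time: {elapsed / n * 1000:.2f} ms "
      f"(~{n / elapsed:.0f} iterations/s upper bound for the display loop)")
cv2.destroyAllWindows()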