wang-xinyu / tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API
MIT License

Trouble to load retinaface engine for RTX 4090 and CUDA 11.8 #1299

Closed barzan-hayati closed 9 months ago

barzan-hayati commented 1 year ago

Env

About this repo

Your problem

[04/26/2023-13:50:21] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[04/26/2023-13:50:21] [W] [TRT] The implicit batch dimension mode has been deprecated. Please create the network with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag whenever possible.
Loading weights: ../retinaface.wts
Building engine, please wait for a while...
Build engine successfully!
[04/26/2023-13:51:10] [E] [TRT] 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)

By setting USE_FP16 I got this warning:

[04/26/2023-14:03:48] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[04/26/2023-14:03:48] [W] [TRT] The implicit batch dimension mode has been deprecated. Please create the network with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag whenever possible.
Loading weights: ../retinaface.wts
Building engine, please wait for a while...
[04/26/2023-14:06:10] [W] [TRT] TensorRT encountered issues when converting weights between types and that could affect accuracy.
[04/26/2023-14:06:10] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
[04/26/2023-14:06:10] [W] [TRT] Check verbose logs for the list of affected weights.
[04/26/2023-14:06:10] [W] [TRT] - 27 weights are affected by this issue: Detected subnormal FP16 values.
[04/26/2023-14:06:10] [W] [TRT] - 1 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
Build engine successfully!
[04/26/2023-14:06:10] [E] [TRT] 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)

Despite this error, I can still run face detection with these engines.
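The `~Builder` message in both logs says the builder was destroyed while objects it had created were still alive. Which object is left over in this repo's code is not shown in the thread, so the following is only a sketch of the ordering rule, not the actual fix. `MockBuilder` and `MockChild` are hypothetical stand-ins, not TensorRT types. Holding the builder and its children in `std::unique_ptr` locals, with the builder declared first, makes C++ destroy them in reverse declaration order, so the builder goes last:

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Records the order in which the mock objects are destroyed.
static std::vector<std::string> g_destroyed;

// Hypothetical stand-in for a builder (NOT a TensorRT type).
struct MockBuilder {
    ~MockBuilder() { g_destroyed.push_back("builder"); }
};

// Hypothetical stand-in for anything the builder creates
// (for example a network definition or a builder config).
struct MockChild {
    std::string name;
    MockChild(MockBuilder&, std::string n) : name(std::move(n)) {}
    ~MockChild() { g_destroyed.push_back(name); }
};

// Creates a builder and two children, lets them go out of scope,
// and returns the observed destruction order.
std::vector<std::string> build_and_release() {
    g_destroyed.clear();
    {
        auto builder = std::make_unique<MockBuilder>();  // declared first
        auto network = std::make_unique<MockChild>(*builder, "network");
        auto config  = std::make_unique<MockChild>(*builder, "config");
        // ... build and serialize the engine here ...
    }  // locals are destroyed in reverse order: config, network, builder
    return g_destroyed;
}
```

Calling `build_and_release()` returns the names in destruction order, with `"builder"` as the last entry. With manual `destroy()` calls the same rule applies: release every object obtained from the builder before releasing the builder itself.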

Now I want to load these engines into a DeepStream pipeline, which worked without problems with CUDA 11.4 and TensorRT 8.2 on Ubuntu 18.04. With the current setup, however, I get the following error, and it seems the pipeline cannot load the engine correctly:

gstnvtracker: Loading low-level lib at ../../resources/tracker_configs/libnvds_nvmultiobjecttracker.so
gstnvtracker: Batch processing is ON
gstnvtracker: Past frame output is OFF
~~ CLOG[src/modules/ReID/ReID.cpp, loadTRTEngine() @line 418]: Engine file does not exist
[NvMultiObjectTracker] Load engine failed. Create engine again.

!![ERROR] UFF file does not exist
[NvMultiObjectTracker] De-initialized
An exception occurred. UFF file does not exist
gstnvtracker: Failed to initialize tracker context!
gstnvtracker:: Failed to create batch context. Shutting down processing.
 Running... 
gstnvtracker: Loading low-level lib at ../../resources/tracker_configs/libnvds_nvmultiobjecttracker.so
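The messages above come from `ReID.cpp` inside `NvMultiObjectTracker`, so the missing engine and UFF files appear to be the ones the tracker's ReID module expects, not necessarily the retinaface engine. One way to narrow this down is to check that every model file referenced by the pipeline configuration exists before starting the pipeline. The sketch below assumes C++17; the function name and the paths in the usage note are hypothetical and not taken from the issue:

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

// Returns the subset of `paths` that do not point to an existing regular file.
// The caller supplies the list of files the pipeline config refers to.
std::vector<std::string> missing_files(const std::vector<std::string>& paths) {
    std::vector<std::string> missing;
    for (const auto& p : paths) {
        std::error_code ec;  // non-throwing overload: lookup errors count as "missing"
        if (!std::filesystem::is_regular_file(p, ec)) {
            missing.push_back(p);
        }
    }
    return missing;
}
```

For example, `missing_files({"retina_mnet.engine", "reid_model.uff"})` (placeholder names) returns whichever of the two files is absent, which makes it easier to tell a missing tracker model apart from a problem with the detector engine.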
z13228604287 commented 1 year ago

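This warning appears because CUDA lazy module loading is not enabled. Enabling it can significantly reduce device memory usage. Lazy module loading is an option that loads CUDA modules only when they are needed instead of loading all of them at startup, which reduces device memory usage, especially when several CUDA applications run at the same time. To enable it, set the environment variable CUDA_MODULE_LOADING to 1. You can set it with the following code:

`#include <cuda_runtime_api.h> #include <cstdlib> std::string env_var = "CUDA_MODULE_LOADING=1"; _putenv(env_var.c_str());`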

barzan-hayati commented 1 year ago

std::string env_var = "CUDA_MODULE_LOADING=1"; _putenv(env_var.c_str());

Where should I add this code? There are multiple functions in calibrator.cpp and retina_mnet.cpp.

barzan-hayati commented 1 year ago

#include <cuda_runtime_api.h> #include <cstdlib> std::string env_var = "CUDA_MODULE_LOADING=1"; _putenv(env_var.c_str());

Nothing changes with these commands.
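Two details may explain why the snippet has no effect. The CUDA programming guide linked in the log lists `LAZY` (not `1`) as the value that enables lazy loading, and `_putenv` belongs to the Windows C runtime, whereas on Linux the usual call is `setenv`. The variable also has to be set before the process makes its first CUDA call, for example at the top of `main`. The helper names below are hypothetical. Note that the log line is a warning and is separate from the `~Builder` error, so this change is not expected to fix that error.

```cpp
#include <cstdlib>
#include <stdlib.h>  // setenv (POSIX) / _putenv_s (Windows CRT)
#include <string>

// Sets CUDA_MODULE_LOADING=LAZY for the current process.
// Call this before any CUDA or TensorRT function is used.
// Returns true if the environment variable was set successfully.
bool enable_cuda_lazy_loading() {
#ifdef _WIN32
    return _putenv_s("CUDA_MODULE_LOADING", "LAZY") == 0;
#else
    return setenv("CUDA_MODULE_LOADING", "LAZY", /*overwrite=*/1) == 0;
#endif
}

// Reads the value back; returns an empty string if the variable is unset.
std::string current_module_loading() {
    const char* value = std::getenv("CUDA_MODULE_LOADING");
    return value ? std::string(value) : std::string();
}
```

An alternative that needs no code change is to export the variable in the shell before launching the program, for example `export CUDA_MODULE_LOADING=LAZY`.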

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

barzan-hayati commented 1 year ago

The problem has not been resolved yet.

Dominic-ZZ commented 1 year ago

Mark, same error.

wang-xinyu commented 11 months ago

Please check if this comment can solve this issue. https://github.com/wang-xinyu/tensorrtx/issues/1310#issuecomment-1722243922

stale[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.