Hang issue in Tesla T4 GPU

I am able to build .so file for deepstream_yolo, but on runningdeepstream-app, I am getting hang issue. $ deepstream-app -c deepstream_app_config_yolo.txt

I am getting this log:

`WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1487 Deserialize engine failed because file path: /mnt/home/nawin.ks/Model_inference/deepstream_yolo_configTry/yolov4_-1_3_416_416_nms_dynamic.onnx_b16_gpu0_fp16.engine open error
0:00:03.335773476  6275 0x55a15ac71300 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/mnt/home/nawin.ks/Model_inference/deepstream_yolo_configTry/yolov4_-1_3_416_416_nms_dynamic.onnx_b16_gpu0_fp16.engine failed
0:00:03.384774338  6275 0x55a15ac71300 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/mnt/home/nawin.ks/Model_inference/deepstream_yolo_configTry/yolov4_-1_3_416_416_nms_dynamic.onnx_b16_gpu0_fp16.engine failed, try rebuild
0:00:03.384804720  6275 0x55a15ac71300 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
WARNING: [TRT]: onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: builtin_op_importers.cpp:5245: Attribute scoreBits not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
WARNING: [TRT]: builtin_op_importers.cpp:5245: Attribute caffeSemantics not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
WARNING: [TRT]: Using PreviewFeature::kFASTER_DYNAMIC_SHAPES_0805 can help improve performance and resolve potential functional issues.
WARNING: [TRT]: Using PreviewFeature::kFASTER_DYNAMIC_SHAPES_0805 can help improve performance and resolve potential functional issues.
WARNING: [TRT]: TensorRT was linked against cuDNN 8.6.0 but loaded cuDNN 8.2.2`

After last line of log, it does nothing and does not terminate itself.

I am not getting hang issue on replacing YOLOv4 ONNX model to lighter Resnet10 OMMX model and disabling use of libnvdsinfer_custom_impl_Yolo.so and NvDsInferParseCustomYoloV4 function.

Deepstream version: 6.2 Tensorrt version: 2.5.2 Platform: AWS EC2 g4dn.2xlarge

Can you please help me on this?

NVIDIA-AI-IOT / yolo_deepstream

Hang issue in Tesla T4 GPU #44